Gam models: summary
Research question
How did neonatal health evolve through time (2007-2020)?
Particularly, did it change during two important crisis: the 2009 economic crisis and the current COVID-19 pandemic?
We focus on the evolution of birthweight and gestational age, and on the rates of stillbirth, preterm birth, lowbirth weight and macrosomia babies.
COVID data will be included soon: hospital cases, mortality. Temperature will also maybe be included.
Data and Methods
Data
We used data from the Federal Office of Statistics: Bevnat. It covers:
All births that occurred in Switzerland
Between 1987 and 2020 (year 2021 will soon be included)
Variables
Swiss-SEP
Average Swiss neighbourhood index of socioeconomic position (Swiss-SEP 2.0). Swiss-SEP 2.0 indicates the average socioeconomic situation in a municipality. It was developed by the Institute for Social and Preventive Medicine at the University of Bern.
Exclusions steps
- As gestational age was only recorded systematically from 2007, we analysed data from this year only
- Steps of exclusions: see the ppt
Definitions:
Preterm birth babies : <37 weeks
Low birthweight babies : <2500g
Macrosomia babies: >= 4000g
“normal” birthweight babies: )2500-4000)
We work with two different datasets:
one that includes stillbirth cases to investigate stillbirth outcome: bevn_eco_in6.
one that excludes stillbirth cases to investigate birthweight/gestational age/ preterm birth/ low birth weight/ macrosomia outcomes : bevn_eco_in7.
Characteristics at inclusion and in two other datasets used
- see tables in html.
Methods
We built different GAM models, investigating the following outcomes:
birthweight as a continuous variable
stillbirth : binary yes/no variable
gestational age: as a continuous variable, in days
preterm birth: binary yes/no variable.
low birthweight: binary yes/no variable, comparing LBW babies vs the rest
low birthweight: binary yes/no variable, comparing LBW babies vs “normal” birthweight babies
macrosomia: binary yes/no variable, comparing macrosomia babies vs “normal” birthweight babies
In the models the variable “birth_Y_M_num” is used to account for time: it is numerical monthly time, with 1= January 2007 and 168=December 2020.
Birthweight
2007-2020
We chose this model based on lower AIC and also lower k value for maternal age, which seems more appropriate when looking at the graph.
m_BW_bam5_2 <- bam(BW ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + s(month, bs="cc") + s(mean_ssep2) + parity_cat + sex + Urban3 + Language + mother_nationality_cat2, data=bevn_eco_in7)
summary(m_BW_bam5_2)##
## Family: gaussian
## Link function: identity
##
## Formula:
## BW ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + s(month, bs = "cc") +
## s(mean_ssep2) + parity_cat + sex + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3332.5436 0.8044 4142.958 < 2e-16 ***
## parity_cat(1,2] 127.6480 0.8036 158.855 < 2e-16 ***
## parity_cat(2,3] 173.9594 1.2289 141.551 < 2e-16 ***
## parity_cat(3,20] 200.6676 2.1315 94.142 < 2e-16 ***
## sex2 -146.8799 0.7183 -204.479 < 2e-16 ***
## Urban3 0.6419 0.7614 0.843 0.399
## LanguageFrench -47.9477 0.8834 -54.278 < 2e-16 ***
## LanguageItalian -62.4453 1.9850 -31.458 < 2e-16 ***
## mother_nationality_cat2Africa 9.4983 2.1839 4.349 1.37e-05 ***
## mother_nationality_cat2Asia -18.2732 1.9793 -9.232 < 2e-16 ***
## mother_nationality_cat2Europe 31.9507 0.8178 39.067 < 2e-16 ***
## mother_nationality_cat2Northern America 66.3642 4.6996 14.121 < 2e-16 ***
## mother_nationality_cat2Southern and Central America 46.1781 2.7101 17.039 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(GA_weeks) 8.880 8.994 78634.633 < 2e-16 ***
## s(mat_age) 6.004 6.912 22.263 < 2e-16 ***
## s(birth_Y_M_num) 8.451 8.915 37.474 < 2e-16 ***
## s(month) 5.044 8.000 3.793 4.51e-07 ***
## s(mean_ssep2) 7.456 8.418 45.986 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.4 Deviance explained = 40%
## fREML = 8.7483e+06 Scale est. = 1.5252e+05 n = 1184367
gam.check(m_BW_bam5_2) ##
## Method: fREML Optimizer: perf newton
## full convergence after 10 iterations.
## Gradient range [-0.02814184,0.02294761]
## (score 8748341 & scale 152523.8).
## Hessian positive definite, eigenvalue range [2.059027,592175].
## Model rank = 57 / 57
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(GA_weeks) 9.00 8.88 1.02 0.92
## s(mat_age) 9.00 6.00 1.01 0.84
## s(birth_Y_M_num) 9.00 8.45 1.00 0.56
## s(month) 8.00 5.04 1.00 0.44
## s(mean_ssep2) 9.00 7.46 1.02 0.90
plot(m_BW_bam5_2, select=1, ylim=c(0, 4000),shade = TRUE, shade.col = "lightblue", shift = coef(m_BW_bam5_2)[1]) plot(m_BW_bam5_2, select=2, ylim=c(3200, 3400),shade = TRUE, shade.col = "lightblue", shift = coef(m_BW_bam5_2)[1]) plot(m_BW_bam5_2, select=3, ylim=c(3300, 3380),shade = TRUE, shade.col = "lightblue", shift = coef(m_BW_bam5_2)[1], xlab = "1 : 2007-01, 25: 2009-01, 50: 2011-02, 75: 2013-03, 100: 2015-04, 125: 2017-05, 150: 2019-06") plot(m_BW_bam5_2, select=4, ylim=c(3325, 3345),shade = TRUE, shade.col = "lightblue", shift = coef(m_BW_bam5_2)[1]) plot(m_BW_bam5_2, select=5, ylim=c(3200, 3400),shade = TRUE, shade.col = "lightblue", shift = coef(m_BW_bam5_2)[1])To take into account the fact that we are analysing a huge dataset (n=1,099,368), we calculated Cohen’s d. (calculation and interpreation of d guide
)
Cohen’s d formula:
\(d=\frac{beta}{SD}\) and \(SD=\sqrt{n}*se\)
\(d=\frac{beta}{\sqrt{n}*se}\)
- beta: estimate
- se: standard error
- n: population/observation size
- SD: standard deviation
#getting the CI 95% and d (interpret only for the non-smooth parameters)
beta <- coef(m_BW_bam5_2)
Vb <- vcov(m_BW_bam5_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
n <- 1099328
d <- (abs(beta)/(sqrt(n)*se))
my.ci <- data.frame(cbind(beta, lci = beta-1.96*se, uci = beta + 1.96*se, d))
round(head(my.ci,13), 2)## beta lci uci d
## (Intercept) 3332.54 3330.97 3334.12 3.95
## parity_cat(1,2] 127.65 126.07 129.22 0.15
## parity_cat(2,3] 173.96 171.55 176.37 0.14
## parity_cat(3,20] 200.67 196.49 204.85 0.09
## sex2 -146.88 -148.29 -145.47 0.20
## Urban3 0.64 -0.85 2.14 0.00
## LanguageFrench -47.95 -49.68 -46.22 0.05
## LanguageItalian -62.45 -66.34 -58.55 0.03
## mother_nationality_cat2Africa 9.50 5.22 13.78 0.00
## mother_nationality_cat2Asia -18.27 -22.15 -14.39 0.01
## mother_nationality_cat2Europe 31.95 30.35 33.55 0.04
## mother_nationality_cat2Northern America 66.36 57.15 75.58 0.01
## mother_nationality_cat2Southern and Central America 46.18 40.87 51.49 0.02
Interpretation of m_BW_bam5_2 model :
- Non-smooth terms
Increasing parity increases birthweight compared to first births: from (126g, 129g) at second parity, (171g, 176g) at third parity, to (195g, 204g) for parities over 3.
Being born a female decreases birthweight by (-148g, -146g) compared to males.
Urbanicity seems to have no effect on birthweight (it is worth mentioning that it did have an effect as long as mean_ssep variable was not included in the model)
Language: Compared to babies born in German(+Romansh)-speaking Switzerland, being born in the French-speaking part decreases BW by (-50g,-46g), while being born in the Italian-speaking part decreases BW by (-67g,-59g).
Maternal nationality: compared to babies born from Swiss mothers, babies born from Asian mothers are lighter (-22g, -14g). Babies born from other-nationalities mothers are heavier: (7g, 16g) for Africa, (31g, 34g) for other European countries, (55g, 75g) for Northern America, (42g, 86g) for Oceania, and (42g-53g) for Southern/Central America. –> we should discuss the biological relevance of these changes, because variations of around 50g probably don’t impact much neonatal health.
Interpretation of Cohen’s d parameters: we should not overinterpret these values. Usually it is considered that when d is<0.2, the effect should be considered very small. Here as all our ds are<0.2, it suggests that we may not see all these effects in much smaller datasets.
- Smooth terms
Gestational age: Obviously, BW increases with increasing GA, with a plateau at around 800g for babies between 20 and 25 gestational weeks, and another plateau at around 4000g for babies above 45 weeks.
Maternal age: BW increases non-linearly with maternal age: for younger mothers (<25yo) it increases almost linearly from 3280g to 3350g, then declines very slightly until reaching a plateau around 3340g when the mother is 25-40yo. Older women have heavier babies, increasing from 3340g to around 3360g for women 40-50yo (keep in mind we have excluded a few mothers above 50).
Time (month+year combined): globally, BW is decreasing between 2007 and 2019 from 3350 to 3330 which is a minor change, and then increases from 2019 to reach 3350g by the end of 2020. There are some other minor variations throughtout the whole time but this doesnt seem very relevant: only 20g variation.
Seasonal/month: minor sinusoidal seasonal variation between 3330-3340g, with lower birthweight in March and August. These difference dont seem relevant either (10g variation will probably not impact the baby’s health).
Mean ssep: Birthweight increases by increasing ssep, linearly between ssep=25-45 approx (3250-3340g), and then there is some minor birthweight increase for higher ssep, until reaching a BW of around 3360g. Overall, a +100g increase is probably worth mentioning.
2008-2010
Because birth_Y_M_num and month variables correlated too much, month variable was not kept in the model. This was done in every “sub” model including only the years 2008-2010 or 2017-2020.
Mean ssep was set as linear variables (when included as a smooth, the graph clearly showed a linear association).
m_BW_bam_2009_2 <- bam(BW ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + mean_ssep2 + parity_cat + sex + Urban3 + Language + mother_nationality_cat2, data=bevn_eco_in7_2008_10)
summary(m_BW_bam_2009_2)##
## Family: gaussian
## Link function: identity
##
## Formula:
## BW ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + mean_ssep2 +
## parity_cat + sex + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3275.9349 6.4043 511.519 < 2e-16 ***
## mean_ssep2 0.8782 0.1044 8.411 < 2e-16 ***
## parity_cat(1,2] 122.2189 1.8859 64.807 < 2e-16 ***
## parity_cat(2,3] 172.4210 2.8854 59.757 < 2e-16 ***
## parity_cat(3,20] 196.6754 4.9252 39.933 < 2e-16 ***
## sex2 -146.2862 1.6810 -87.023 < 2e-16 ***
## Urban3 0.7995 1.7458 0.458 0.64697
## LanguageFrench -46.0119 2.0624 -22.310 < 2e-16 ***
## LanguageItalian -64.9937 4.3480 -14.948 < 2e-16 ***
## mother_nationality_cat2Africa 41.3021 6.0380 6.840 7.92e-12 ***
## mother_nationality_cat2Asia -15.2229 4.9414 -3.081 0.00207 **
## mother_nationality_cat2Europe 36.4757 1.9398 18.804 < 2e-16 ***
## mother_nationality_cat2Northern America 82.8969 10.6242 7.803 6.09e-15 ***
## mother_nationality_cat2Southern and Central America 60.7497 6.0059 10.115 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(GA_weeks) 8.049 8.697 14885.61 < 2e-16 ***
## s(mat_age) 3.356 4.180 6.88 1.01e-05 ***
## s(birth_Y_M_num) 2.819 3.509 15.17 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.397 Deviance explained = 39.7%
## fREML = 1.6233e+06 Scale est. = 1.5481e+05 n = 219541
plot(m_BW_bam_2009_2, select=1, ylim=c(0, 4000),shade = TRUE, shade.col = "lightblue",shift = coef(m_BW_bam_2009_2)[1]) #GA plot(m_BW_bam_2009_2, select=2, ylim=c(3200, 3350),shade = TRUE, shade.col = "lightblue",shift = coef(m_BW_bam_2009_2)[1]) #mat age plot(m_BW_bam_2009_2, select=3, ylim=c(3250, 3350),shade = TRUE, shade.col = "lightblue", shift = coef(m_BW_bam_2009_2)[1], xlab = "13 : 2008-01, 20: 2008-08, 30: 2009-06, 40: 2010-04, 50: 2011-02") # time gam.check(m_BW_bam_2009_2)##
## Method: fREML Optimizer: perf newton
## full convergence after 9 iterations.
## Gradient range [-0.002199328,0.001611336]
## (score 1623254 & scale 154814.9).
## Hessian positive definite, eigenvalue range [0.2938357,109762].
## Model rank = 41 / 41
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(GA_weeks) 9.00 8.05 1.02 0.880
## s(mat_age) 9.00 3.36 0.98 0.085 .
## s(birth_Y_M_num) 9.00 2.82 0.99 0.350
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
concurvity(m_BW_bam_2009_2, full=TRUE) #values are now fine.## para s(GA_weeks) s(mat_age) s(birth_Y_M_num)
## worst 0.9828194 0.015512619 0.13693385 0.005543902
## observed 0.9828194 0.006735672 0.02785567 0.005139984
## estimate 0.9828194 0.009290297 0.08864274 0.003183070
# Summary of estimates, CI95% and Cohen's d
beta <- coef(m_BW_bam_2009_2)
Vb <- vcov(m_BW_bam_2009_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
n <- 219790
d <- (abs(beta)/(sqrt(n)*se))
my.ci <- data.frame(cbind(beta, lci = beta-1.96*se, uci = beta + 1.96*se, d))
round(head(my.ci,14), 2) ## beta lci uci d
## (Intercept) 3275.93 3263.38 3288.49 1.09
## mean_ssep2 0.88 0.67 1.08 0.02
## parity_cat(1,2] 122.22 118.52 125.92 0.14
## parity_cat(2,3] 172.42 166.77 178.08 0.13
## parity_cat(3,20] 196.68 187.02 206.33 0.09
## sex2 -146.29 -149.58 -142.99 0.19
## Urban3 0.80 -2.62 4.22 0.00
## LanguageFrench -46.01 -50.06 -41.97 0.05
## LanguageItalian -64.99 -73.52 -56.47 0.03
## mother_nationality_cat2Africa 41.30 29.47 53.14 0.01
## mother_nationality_cat2Asia -15.22 -24.91 -5.54 0.01
## mother_nationality_cat2Europe 36.48 32.67 40.28 0.04
## mother_nationality_cat2Northern America 82.90 62.07 103.72 0.02
## mother_nationality_cat2Southern and Central America 60.75 48.98 72.52 0.02
2017-2020
We also did not use the month variable because of correlation between month and time (birth_Y_M_num) variables.
m_BW_bam_covid_2 <- bam(BW ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + s(mean_ssep2) + parity_cat + sex + Urban3 + Language + mother_nationality_cat2, data=bevn_eco_in7_2017_20)
summary(m_BW_bam_covid_2)##
## Family: gaussian
## Link function: identity
##
## Formula:
## BW ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + s(mean_ssep2) +
## parity_cat + sex + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3332.96317 1.35017 2468.559 < 2e-16 ***
## parity_cat(1,2] 132.00708 1.34064 98.466 < 2e-16 ***
## parity_cat(2,3] 180.27876 2.04234 88.271 < 2e-16 ***
## parity_cat(3,20] 204.30916 3.56285 57.344 < 2e-16 ***
## sex2 -144.71279 1.20112 -120.481 < 2e-16 ***
## Urban3 0.03044 1.26785 0.024 0.9808
## LanguageFrench -50.00779 1.45563 -34.355 < 2e-16 ***
## LanguageItalian -60.40064 3.53628 -17.080 < 2e-16 ***
## mother_nationality_cat2Africa -7.07553 3.39816 -2.082 0.0373 *
## mother_nationality_cat2Asia -17.70861 3.17747 -5.573 2.50e-08 ***
## mother_nationality_cat2Europe 26.69512 1.35725 19.669 < 2e-16 ***
## mother_nationality_cat2Northern America 64.96064 8.13490 7.985 1.40e-15 ***
## mother_nationality_cat2Southern and Central America 37.53444 4.80963 7.804 6.01e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(GA_weeks) 8.690 8.964 27609.45 <2e-16 ***
## s(mat_age) 6.052 6.959 28.09 <2e-16 ***
## s(birth_Y_M_num) 6.877 7.972 11.30 <2e-16 ***
## s(mean_ssep2) 4.608 5.723 19.23 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.398 Deviance explained = 39.8%
## fREML = 3.0905e+06 Scale est. = 1.5076e+05 n = 418723
plot(m_BW_bam_covid_2, select=1, ylim=c(0, 4000),shade = TRUE, shade.col = "lightblue",shift = coef(m_BW_bam_covid_2)[1]) plot(m_BW_bam_covid_2, select=2, ylim=c(3200, 3400),shade = TRUE, shade.col = "lightblue",shift = coef(m_BW_bam_covid_2)[1]) plot(m_BW_bam_covid_2, select=3, ylim=c(3310, 3360),shade = TRUE, shade.col = "lightblue", shift = coef(m_BW_bam_covid_2)[1],xlab = "120 : 2016-12, 130: 2017-10, 140: 2018-08, 150: 2019-06, 160: 2020-04") plot(m_BW_bam_covid_2, select=4, ylim=c(3200, 3400),shade = TRUE, shade.col = "lightblue", shift = coef(m_BW_bam_covid_2)[1]) gam.check(m_BW_bam_covid_2)##
## Method: fREML Optimizer: perf newton
## full convergence after 10 iterations.
## Gradient range [-9.060118e-05,5.371238e-05]
## (score 3090455 & scale 150755.3).
## Hessian positive definite, eigenvalue range [1.292255,209353].
## Model rank = 49 / 49
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(GA_weeks) 9.00 8.69 1.00 0.54
## s(mat_age) 9.00 6.05 0.98 0.11
## s(birth_Y_M_num) 9.00 6.88 1.00 0.38
## s(mean_ssep2) 9.00 4.61 1.01 0.66
concurvity(m_BW_bam_covid_2, full=TRUE) #values are now fine.## para s(GA_weeks) s(mat_age) s(birth_Y_M_num) s(mean_ssep2)
## worst 0.802818 0.018709586 0.10728719 0.0021771575 0.17670014
## observed 0.802818 0.007301761 0.07036983 0.0008501963 0.10533841
## estimate 0.802818 0.011129202 0.06866282 0.0017502403 0.08589876
# Summary of estimates, CI95% and Cohen's d
beta <- coef(m_BW_bam_covid_2)
Vb <- vcov(m_BW_bam_covid_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
n <- 332752
my.ci <- data.frame(cbind(beta, lci = beta-1.96*se, uci = beta + 1.96*se, d))## Warning in cbind(beta, lci = beta - 1.96 * se, uci = beta + 1.96 * se, d): number of rows of result is not a
## multiple of vector length (arg 4)
round(head(my.ci,13), 2)## beta lci uci d
## (Intercept) 3332.96 3330.32 3335.61 1.09
## parity_cat(1,2] 132.01 129.38 134.63 0.02
## parity_cat(2,3] 180.28 176.28 184.28 0.14
## parity_cat(3,20] 204.31 197.33 211.29 0.13
## sex2 -144.71 -147.07 -142.36 0.09
## Urban3 0.03 -2.46 2.52 0.19
## LanguageFrench -50.01 -52.86 -47.15 0.00
## LanguageItalian -60.40 -67.33 -53.47 0.05
## mother_nationality_cat2Africa -7.08 -13.74 -0.42 0.03
## mother_nationality_cat2Asia -17.71 -23.94 -11.48 0.01
## mother_nationality_cat2Europe 26.70 24.03 29.36 0.01
## mother_nationality_cat2Northern America 64.96 49.02 80.91 0.04
## mother_nationality_cat2Southern and Central America 37.53 28.11 46.96 0.02
Stillbirth
2007-2020
m_SB_bam2 <- bam(stillbirth~s(GA_weeks)+ s(mat_age) + s(month, bs="cc") + birth_Y_M_num + mean_ssep2 + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in6)
summary(m_SB_bam2)##
## Family: binomial
## Link function: logit
##
## Formula:
## stillbirth ~ s(GA_weeks) + s(mat_age) + s(month, bs = "cc") +
## birth_Y_M_num + mean_ssep2 + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.6210239 0.1397128 -47.390 < 2e-16 ***
## birth_Y_M_num 0.0004291 0.0003541 1.212 0.22554
## mean_ssep2 -0.0062984 0.0022377 -2.815 0.00488 **
## Urban3 0.0409810 0.0373237 1.098 0.27221
## LanguageFrench 0.0945006 0.0420954 2.245 0.02477 *
## LanguageItalian 0.0151122 0.0982534 0.154 0.87776
## mother_nationality_cat2Africa 0.1969551 0.0900981 2.186 0.02882 *
## mother_nationality_cat2Asia -0.0410686 0.0974290 -0.422 0.67337
## mother_nationality_cat2Europe 0.0277987 0.0407287 0.683 0.49490
## mother_nationality_cat2Northern America -0.3562366 0.2753805 -1.294 0.19580
## mother_nationality_cat2Southern and Central America -0.1540360 0.1318382 -1.168 0.24266
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 7.609 8.140 18195.921 <2e-16 ***
## s(mat_age) 2.920 3.687 9.798 0.0348 *
## s(month) 1.680 8.000 5.208 0.0289 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.133 Deviance explained = 32.2%
## fREML = 1.6878e+06 Scale est. = 1 n = 1187945
plot(m_SB_bam2, trans = plogis, select=1, shift = coef(m_SB_bam2)[1], shade = TRUE, shade.col = "lightblue") #GA plot(m_SB_bam2, trans = plogis, select=2, ylim=c(0, 0.003), shift = coef(m_SB_bam2)[1], shade = TRUE, shade.col = "lightblue") #mat age plot(m_SB_bam2, trans = plogis, select=3, ylim=c(0.0005, 0.0015), shift = coef(m_SB_bam2)[1], shade = TRUE, shade.col = "lightblue") #month k.check(m_SB_bam2) ## k' edf k-index p-value
## s(GA_weeks) 9 7.608728 0.9057528 0.9725
## s(mat_age) 9 2.920139 0.8907410 0.6925
## s(month) 8 1.680357 0.8860355 0.4200
concurvity(m_SB_bam2)## para s(GA_weeks) s(mat_age) s(month)
## worst 0.9829397 0.010813136 0.05324318 0.0024520811
## observed 0.9829397 0.005278557 0.01291860 0.0003163135
## estimate 0.9829397 0.007031887 0.03568136 0.0003725596
Calculation of Cohen’s d for OR:
\(d= \log(OR) * \frac{\sqrt{3}}{\pi}\)
\(d = \frac{log(OR)}{1.81}\)
#table to show OR, confidence interval and Cohen's d
OR <- exp(coef(m_SB_bam2))
beta <- coef(m_SB_bam2)
Vb <- vcov(m_SB_bam2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 11), 2)## OR lci uci d
## (Intercept) 0.00 0.00 0.00 3.66
## birth_Y_M_num 1.00 1.00 1.00 0.00
## mean_ssep2 0.99 0.99 1.00 0.00
## Urban3 1.04 0.97 1.12 0.02
## LanguageFrench 1.10 1.01 1.19 0.05
## LanguageItalian 1.02 0.84 1.23 0.01
## mother_nationality_cat2Africa 1.22 1.02 1.45 0.11
## mother_nationality_cat2Asia 0.96 0.79 1.16 0.02
## mother_nationality_cat2Europe 1.03 0.95 1.11 0.02
## mother_nationality_cat2Northern America 0.70 0.41 1.20 0.20
## mother_nationality_cat2Southern and Central America 0.86 0.66 1.11 0.09
Time variables (birth_Y_M_num) and mean ssep variable were put as linear terms, because the model including them as smoothed terms showed that the relationship between these and stillbirth was linear.
#Adding BW –> discussing this.
m_SB_bam3 = update(m_SB_bam2, . ~ . + s(BW))
plot(m_SB_bam3, trans = plogis, select=4, ylim=c(0, 1), shift = coef(m_SB_bam3)[1], shade = TRUE, shade.col = "lightblue") #BW with whole CI plot(m_SB_bam3, trans = plogis, select=4, ylim=c(0, 0.15), shift = coef(m_SB_bam3)[1], shade = TRUE, shade.col = "lightblue") #BW concurvity(m_SB_bam3)## para s(GA_weeks) s(mat_age) s(month) s(BW)
## worst 0.982946 0.8785980 0.05468627 0.0024631678 0.8786120
## observed 0.982946 0.4687422 0.02566686 0.0003148719 0.5110680
## estimate 0.982946 0.4251104 0.03677265 0.0003857912 0.3311017
k.check(m_SB_bam3)## k' edf k-index p-value
## s(GA_weeks) 9 7.319197 1.0186709 1.000
## s(mat_age) 9 2.323332 0.9086021 0.030
## s(month) 8 1.762571 0.9486990 0.675
## s(BW) 9 7.255290 0.8887452 0.010
When we add birthweight, there is too much correlation betwwen gestational age and birthweight (concurvity shows 0.88 which is to close to 1): it is not a good idea to include BW in the model. But we can have a look at the smoothed curve for stillbirth depending on birthweight: it has a clear U shaped, with high stillbirth risk (2-11%) for very low birthweight babies (<2kg), and moderately high stillbirth risk (5-8%) for very high birthweight babies (>8kg).
2008-2010
m_SB_bam_2009_2<-bam(stillbirth~s(GA_weeks)+ s(mat_age) + birth_Y_M_num + mean_ssep2 + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in6_2008_10)
summary(m_SB_bam_2009_2)##
## Family: binomial
## Link function: logit
##
## Formula:
## stillbirth ~ s(GA_weeks) + s(mat_age) + birth_Y_M_num + mean_ssep2 +
## Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -6.410283 0.339284 -18.894 <2e-16 ***
## birth_Y_M_num -0.003885 0.004052 -0.959 0.3377
## mean_ssep2 -0.007011 0.005230 -1.341 0.1801
## Urban3 0.174868 0.086720 2.016 0.0438 *
## LanguageFrench 0.034067 0.100419 0.339 0.7344
## LanguageItalian 0.043316 0.208716 0.208 0.8356
## mother_nationality_cat2Africa -0.307759 0.262914 -1.171 0.2418
## mother_nationality_cat2Asia -0.622698 0.292782 -2.127 0.0334 *
## mother_nationality_cat2Europe 0.065845 0.093735 0.702 0.4824
## mother_nationality_cat2Northern America -0.831244 0.740833 -1.122 0.2618
## mother_nationality_cat2Southern and Central America -0.319156 0.314464 -1.015 0.3101
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 5.928 6.942 3434.773 <2e-16 ***
## s(mat_age) 2.098 2.664 3.117 0.32
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.149 Deviance explained = 32.4%
## fREML = 3.1219e+05 Scale est. = 1 n = 220215
plot(m_SB_bam_2009_2, trans = plogis, select=1, ylim=c(0, 0.9) ,shift = coef(m_SB_bam_2009_2)[1], shade = TRUE, shade.col = "lightblue") #GA plot(m_SB_bam_2009_2, trans = plogis, select=2, ylim=c(0, 0.005), shift = coef(m_SB_bam_2009_2)[1], shade = TRUE, shade.col = "lightblue") #mat age k.check(m_SB_bam_2009_2)## k' edf k-index p-value
## s(GA_weeks) 9 5.928347 0.9184041 0.0475
## s(mat_age) 9 2.098210 0.9557733 0.8625
concurvity(m_SB_bam_2009_2)## para s(GA_weeks) s(mat_age)
## worst 0.9840637 0.010335562 0.060819741
## observed 0.9840637 0.005510772 0.005650732
## estimate 0.9840637 0.006763660 0.037289301
OR <- exp(coef(m_SB_bam_2009_2))
beta <- coef(m_SB_bam_2009_2)
Vb <- vcov(m_SB_bam_2009_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 11), 2)## OR lci uci d
## (Intercept) 0.00 0.00 0.00 3.54
## birth_Y_M_num 1.00 0.99 1.00 0.00
## mean_ssep2 0.99 0.98 1.00 0.00
## Urban3 1.19 1.00 1.41 0.10
## LanguageFrench 1.03 0.85 1.26 0.02
## LanguageItalian 1.04 0.69 1.57 0.02
## mother_nationality_cat2Africa 0.74 0.44 1.23 0.17
## mother_nationality_cat2Asia 0.54 0.30 0.95 0.34
## mother_nationality_cat2Europe 1.07 0.89 1.28 0.04
## mother_nationality_cat2Northern America 0.44 0.10 1.86 0.46
## mother_nationality_cat2Southern and Central America 0.73 0.39 1.35 0.18
As for the previous model (including all timing 2007-2020), we kept time(birth_Y_M_num) and mean_ssep2 as linear terms.
2017-2020
m_SB_bam_covid_2 <-bam(stillbirth~s(GA_weeks)+ mat_age + birth_Y_M_num + mean_ssep2 + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in6_2017_20)
summary(m_SB_bam_covid_2)##
## Family: binomial
## Link function: logit
##
## Formula:
## stillbirth ~ s(GA_weeks) + mat_age + birth_Y_M_num + mean_ssep2 +
## Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -7.302194 0.375487 -19.447 < 2e-16 ***
## mat_age 0.007155 0.006086 1.176 0.23971
## birth_Y_M_num 0.001604 0.001730 0.927 0.35391
## mean_ssep2 -0.002605 0.003757 -0.693 0.48805
## Urban3 -0.083645 0.062767 -1.333 0.18265
## LanguageFrench 0.191494 0.069030 2.774 0.00554 **
## LanguageItalian -0.059268 0.191718 -0.309 0.75721
## mother_nationality_cat2Africa 0.417387 0.136384 3.060 0.00221 **
## mother_nationality_cat2Asia 0.133490 0.150175 0.889 0.37406
## mother_nationality_cat2Europe 0.009960 0.068610 0.145 0.88458
## mother_nationality_cat2Northern America 0.129843 0.408030 0.318 0.75032
## mother_nationality_cat2Southern and Central America -0.144730 0.224495 -0.645 0.51913
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 7.355 8.009 6408 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.128 Deviance explained = 33%
## fREML = 5.9618e+05 Scale est. = 1 n = 419983
plot(m_SB_bam_covid_2, trans = plogis, select=1, ylim=c(0, 1) ,shift = coef(m_SB_bam_covid_2)[1], shade = TRUE, shade.col = "lightblue") #GA OR <- exp(coef(m_SB_bam_covid_2))
beta <- coef(m_SB_bam_covid_2)
Vb <- vcov(m_SB_bam_covid_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 12), 2)## OR lci uci d
## (Intercept) 0.00 0.00 0.00 4.03
## mat_age 1.01 1.00 1.02 0.00
## birth_Y_M_num 1.00 1.00 1.01 0.00
## mean_ssep2 1.00 0.99 1.00 0.00
## Urban3 0.92 0.81 1.04 0.05
## LanguageFrench 1.21 1.06 1.39 0.11
## LanguageItalian 0.94 0.65 1.37 0.03
## mother_nationality_cat2Africa 1.52 1.16 1.98 0.23
## mother_nationality_cat2Asia 1.14 0.85 1.53 0.07
## mother_nationality_cat2Europe 1.01 0.88 1.16 0.01
## mother_nationality_cat2Northern America 1.14 0.51 2.53 0.07
## mother_nationality_cat2Southern and Central America 0.87 0.56 1.34 0.08
k.check(m_SB_bam_covid_2)## k' edf k-index p-value
## s(GA_weeks) 9 7.35472 0.9382636 0.815
concurvity(m_SB_bam_2009_2)## para s(GA_weeks) s(mat_age)
## worst 0.9840637 0.010335562 0.060819741
## observed 0.9840637 0.005510772 0.005650732
## estimate 0.9840637 0.006763660 0.037289301
Why we chose this model: Maternal age had edf values very close to one: we put it as a linear term instead of smoothed variable. As for the previous model (including all timing 2007-2020), we kept time (birth_Y_M_num) and mean_ssep2 variables as linear terms.
Gestational age in days (continuous)
2007-2020
m_GA_bam_days <- bam(GA_days ~ s(BW) + s(mat_age) + s(birth_Y_M_num) + s(month, bs="cc") + s(mean_ssep2) + parity_cat + sex + Urban3 + mother_nationality_cat2 + Language, data=bevn_eco_in7)
summary(m_GA_bam_days)##
## Family: gaussian
## Link function: identity
##
## Formula:
## GA_days ~ s(BW) + s(mat_age) + s(birth_Y_M_num) + s(month, bs = "cc") +
## s(mean_ssep2) + parity_cat + sex + Urban3 + mother_nationality_cat2 +
## Language
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 276.07118 0.01685 16386.283 < 2e-16 ***
## parity_cat(1,2] -1.99314 0.01688 -118.092 < 2e-16 ***
## parity_cat(2,3] -2.50729 0.02584 -97.036 < 2e-16 ***
## parity_cat(3,20] -2.58971 0.04475 -57.868 < 2e-16 ***
## sex2 1.80642 0.01522 118.682 < 2e-16 ***
## Urban3 0.08410 0.01596 5.271 1.36e-07 ***
## mother_nationality_cat2Africa 0.60324 0.04574 13.188 < 2e-16 ***
## mother_nationality_cat2Asia -0.81997 0.04145 -19.782 < 2e-16 ***
## mother_nationality_cat2Europe -0.36475 0.01714 -21.277 < 2e-16 ***
## mother_nationality_cat2Northern America -0.31178 0.09845 -3.167 0.00154 **
## mother_nationality_cat2Southern and Central America -1.60422 0.05676 -28.265 < 2e-16 ***
## LanguageFrench 0.14384 0.01855 7.755 8.81e-15 ***
## LanguageItalian -0.18286 0.04161 -4.395 1.11e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(BW) 8.980 9.000 137150.04 <2e-16 ***
## s(mat_age) 7.987 8.600 375.14 <2e-16 ***
## s(birth_Y_M_num) 8.738 8.980 116.13 <2e-16 ***
## s(month) 7.132 8.000 11.55 <2e-16 ***
## s(mean_ssep2) 8.047 8.758 63.78 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.514 Deviance explained = 51.4%
## fREML = 4.17e+06 Scale est. = 66.929 n = 1184367
#plot(m_GA_bam_days, all.terms = TRUE,pages=1, shade = TRUE, shade.col = "lightblue")
plot(m_GA_bam_days, select=1, shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_days)[1]) #BW plot(m_GA_bam_days, select=2, ylim=c(265, 280),shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_days)[1]) #mat age plot(m_GA_bam_days, select=3, ylim=c(274, 278),shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_days)[1], xlab = "1 : 2007-01, 25: 2009-01, 50: 2011-02, 75: 2013-03, 100: 2015-04, 125: 2017-05, 150: 2019-06") #birth_Y_M_num plot(m_GA_bam_days, select=4, ylim=c(275, 277),shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_days)[1]) #month plot(m_GA_bam_days, select=5, ylim=c(272, 280),shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_days)[1]) #ssep gam.check(m_GA_bam_days)##
## Method: fREML Optimizer: perf newton
## full convergence after 14 iterations.
## Gradient range [-0.001144173,0.001139734]
## (score 4170001 & scale 66.92923).
## Hessian positive definite, eigenvalue range [2.067224,592175].
## Model rank = 57 / 57
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(BW) 9.00 8.98 0.98 0.075 .
## s(mat_age) 9.00 7.99 1.02 0.855
## s(birth_Y_M_num) 9.00 8.74 1.01 0.850
## s(month) 8.00 7.13 0.98 0.060 .
## s(mean_ssep2) 9.00 8.05 0.99 0.250
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
concurvity(m_GA_bam_days, full=TRUE)## para s(BW) s(mat_age) s(birth_Y_M_num) s(month) s(mean_ssep2)
## worst 0.8009597 0.04838736 0.12017455 0.01923103 0.0153242365 0.17842652
## observed 0.8009597 0.01907332 0.06787388 0.01096697 0.0005763312 0.04757637
## estimate 0.8009597 0.03055512 0.07628677 0.01060936 0.0018068021 0.08868056
#estimate, 95%CI and d
beta <- coef(m_GA_bam_days)
Vb <- vcov(m_GA_bam_days, unconditional = TRUE)
se <- sqrt(diag(Vb))
n <- 1099328
d <- (abs(beta)/(sqrt(n)*se))
my.ci <- data.frame(cbind(beta, lci = beta-1.96*se, uci = beta + 1.96*se, d))
round(head(my.ci,13), 2)## beta lci uci d
## (Intercept) 276.07 276.04 276.10 15.63
## parity_cat(1,2] -1.99 -2.03 -1.96 0.11
## parity_cat(2,3] -2.51 -2.56 -2.46 0.09
## parity_cat(3,20] -2.59 -2.68 -2.50 0.06
## sex2 1.81 1.78 1.84 0.11
## Urban3 0.08 0.05 0.12 0.01
## mother_nationality_cat2Africa 0.60 0.51 0.69 0.01
## mother_nationality_cat2Asia -0.82 -0.90 -0.74 0.02
## mother_nationality_cat2Europe -0.36 -0.40 -0.33 0.02
## mother_nationality_cat2Northern America -0.31 -0.50 -0.12 0.00
## mother_nationality_cat2Southern and Central America -1.60 -1.72 -1.49 0.03
## LanguageFrench 0.14 0.11 0.18 0.01
## LanguageItalian -0.18 -0.26 -0.10 0.00
2008-2010
m_GA_bam_2009 <- bam(GA_days ~ s(BW) + s(mat_age) + s(birth_Y_M_num) + s(mean_ssep2) + parity_cat + sex + Urban3 + mother_nationality_cat2 + Language, data=bevn_eco_in7_2008_10)
summary(m_GA_bam_2009)##
## Family: gaussian
## Link function: identity
##
## Formula:
## GA_days ~ s(BW) + s(mat_age) + s(birth_Y_M_num) + s(mean_ssep2) +
## parity_cat + sex + Urban3 + mother_nationality_cat2 + Language
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 275.58327 0.03933 7006.080 < 2e-16 ***
## parity_cat(1,2] -1.86036 0.03990 -46.630 < 2e-16 ***
## parity_cat(2,3] -2.27644 0.06112 -37.247 < 2e-16 ***
## parity_cat(3,20] -2.34844 0.10411 -22.558 < 2e-16 ***
## sex2 1.81049 0.03585 50.507 < 2e-16 ***
## Urban3 0.16306 0.03747 4.352 1.35e-05 ***
## mother_nationality_cat2Africa -0.18009 0.12734 -1.414 0.15730
## mother_nationality_cat2Asia -0.87989 0.10417 -8.446 < 2e-16 ***
## mother_nationality_cat2Europe -0.29513 0.04100 -7.199 6.11e-13 ***
## mother_nationality_cat2Northern America -0.58773 0.22409 -2.623 0.00872 **
## mother_nationality_cat2Southern and Central America -1.51699 0.12662 -11.980 < 2e-16 ***
## LanguageFrench -0.08712 0.04401 -1.979 0.04778 *
## LanguageItalian -0.21050 0.09199 -2.288 0.02213 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(BW) 8.878 8.995 24548.39 <2e-16 ***
## s(mat_age) 7.169 7.932 66.92 <2e-16 ***
## s(birth_Y_M_num) 6.185 7.342 30.98 <2e-16 ***
## s(mean_ssep2) 4.787 5.925 20.43 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.505 Deviance explained = 50.5%
## fREML = 7.7609e+05 Scale est. = 68.827 n = 219541
#plot(m_GA_bam_2009, all.terms = TRUE,pages=1, shade = TRUE, shade.col = "lightblue")
plot(m_GA_bam_2009, select=1, shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_2009)[1]) #BW plot(m_GA_bam_2009, select=2, ylim=c(265, 280),shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_2009)[1]) #mat age plot(m_GA_bam_2009, select=3, ylim=c(274, 278),shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_2009)[1], xlab = "13 : 2008-01, 20: 2008-08, 30: 2009-06, 40: 2010-04, 50: 2011-02") #birth_Y_M_num plot(m_GA_bam_2009, select=4, ylim=c(272, 280),shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_2009)[1]) #ssep gam.check(m_GA_bam_2009)##
## Method: fREML Optimizer: perf newton
## full convergence after 12 iterations.
## Gradient range [-5.224197e-05,5.200233e-05]
## (score 776094 & scale 68.82725).
## Hessian positive definite, eigenvalue range [0.9693772,109762].
## Model rank = 49 / 49
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(BW) 9.00 8.88 0.99 0.17
## s(mat_age) 9.00 7.17 0.99 0.31
## s(birth_Y_M_num) 9.00 6.19 0.99 0.29
## s(mean_ssep2) 9.00 4.79 1.00 0.36
concurvity(m_GA_bam_2009, full=TRUE)## para s(BW) s(mat_age) s(birth_Y_M_num) s(mean_ssep2)
## worst 0.7975678 0.04864524 0.1355458 0.005373763 0.18006695
## observed 0.7975678 0.01986037 0.1118456 0.004330478 0.07065924
## estimate 0.7975678 0.03090752 0.0876663 0.002898412 0.08334234
#estimate, 95%CI and d
beta <- coef(m_GA_bam_2009)
Vb <- vcov(m_GA_bam_2009, unconditional = TRUE)
n <- 219790
se <- sqrt(diag(Vb))
d <- (abs(beta)/(sqrt(n)*se))
my.ci <- data.frame(cbind(beta, lci = beta-1.96*se, uci = beta + 1.96*se, d))
round(head(my.ci,13), 2)## beta lci uci d
## (Intercept) 275.58 275.51 275.66 14.94
## parity_cat(1,2] -1.86 -1.94 -1.78 0.10
## parity_cat(2,3] -2.28 -2.40 -2.16 0.08
## parity_cat(3,20] -2.35 -2.55 -2.14 0.05
## sex2 1.81 1.74 1.88 0.11
## Urban3 0.16 0.09 0.24 0.01
## mother_nationality_cat2Africa -0.18 -0.43 0.07 0.00
## mother_nationality_cat2Asia -0.88 -1.08 -0.68 0.02
## mother_nationality_cat2Europe -0.30 -0.38 -0.21 0.02
## mother_nationality_cat2Northern America -0.59 -1.03 -0.15 0.01
## mother_nationality_cat2Southern and Central America -1.52 -1.77 -1.27 0.03
## LanguageFrench -0.09 -0.17 0.00 0.00
## LanguageItalian -0.21 -0.39 -0.03 0.00
2017-2020
m_GA_bam_covid <- bam(GA_days ~ s(BW) + s(mat_age) + s(birth_Y_M_num) + s(mean_ssep2) + parity_cat + sex + Urban3 + mother_nationality_cat2 + Language, data=bevn_eco_in7_2017_20)
summary(m_GA_bam_covid)##
## Family: gaussian
## Link function: identity
##
## Formula:
## GA_days ~ s(BW) + s(mat_age) + s(birth_Y_M_num) + s(mean_ssep2) +
## parity_cat + sex + Urban3 + mother_nationality_cat2 + Language
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 276.32022 0.02811 9830.277 <2e-16 ***
## parity_cat(1,2] -2.04091 0.02801 -72.853 <2e-16 ***
## parity_cat(2,3] -2.69574 0.04270 -63.134 <2e-16 ***
## parity_cat(3,20] -2.87841 0.07436 -38.706 <2e-16 ***
## sex2 1.79108 0.02530 70.797 <2e-16 ***
## Urban3 0.03747 0.02652 1.413 0.1576
## mother_nationality_cat2Africa 0.85518 0.07076 12.086 <2e-16 ***
## mother_nationality_cat2Asia -0.97197 0.06616 -14.691 <2e-16 ***
## mother_nationality_cat2Europe -0.45519 0.02828 -16.094 <2e-16 ***
## mother_nationality_cat2Northern America -0.43414 0.16946 -2.562 0.0104 *
## mother_nationality_cat2Southern and Central America -1.82619 0.10015 -18.235 <2e-16 ***
## LanguageFrench 0.46622 0.03054 15.267 <2e-16 ***
## LanguageItalian -0.07131 0.07379 -0.966 0.3339
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(BW) 8.967 9.000 48663.535 <2e-16 ***
## s(mat_age) 7.095 7.866 195.938 <2e-16 ***
## s(birth_Y_M_num) 3.245 4.030 3.178 0.0123 *
## s(mean_ssep2) 7.952 8.708 19.843 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.515 Deviance explained = 51.6%
## fREML = 1.4695e+06 Scale est. = 65.409 n = 418723
#plot(m_GA_bam_covid, all.terms = TRUE,pages=1, shade = TRUE, shade.col = "lightblue")
plot(m_GA_bam_covid, select=1, shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_covid)[1]) #BW plot(m_GA_bam_covid, select=2, ylim=c(265, 280),shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_covid)[1]) #mat age plot(m_GA_bam_covid, select=3, ylim=c(274, 278),shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_covid)[1], xlab = "120 : 2016-12, 130: 2017-10, 140: 2018-08, 150: 2019-06, 160: 2020-04") #birth_Y_M_num plot(m_GA_bam_covid, select=4, ylim=c(272, 280),shade = TRUE, shade.col = "lightblue", shift = coef(m_GA_bam_covid)[1]) #ssep gam.check(m_GA_bam_covid)##
## Method: fREML Optimizer: perf newton
## full convergence after 13 iterations.
## Gradient range [-5.899562e-05,5.814507e-05]
## (score 1469498 & scale 65.40897).
## Hessian positive definite, eigenvalue range [0.5475558,209353].
## Model rank = 49 / 49
##
## Basis dimension (k) checking results. Low p-value (k-index<1) may
## indicate that k is too low, especially if edf is close to k'.
##
## k' edf k-index p-value
## s(BW) 9.00 8.97 1.00 0.52
## s(mat_age) 9.00 7.10 1.02 0.88
## s(birth_Y_M_num) 9.00 3.25 0.98 0.14
## s(mean_ssep2) 9.00 7.95 0.98 0.09 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
concurvity(m_GA_bam_covid, full=TRUE)## para s(BW) s(mat_age) s(birth_Y_M_num) s(mean_ssep2)
## worst 0.8024145 0.04813213 0.10586645 0.002210568 0.17688497
## observed 0.8024145 0.01903303 0.03934653 0.001335394 0.02862882
## estimate 0.8024145 0.03109108 0.06703716 0.001774928 0.08597874
#estimate, 95%CI and d
beta <- coef(m_GA_bam_covid)
Vb <- vcov(m_GA_bam_covid, unconditional = TRUE)
n <- 332752
se <- sqrt(diag(Vb))
d <- (abs(beta)/(sqrt(n)*se))
my.ci <- data.frame(cbind(beta, lci = beta-1.96*se, uci = beta + 1.96*se, d))
round(head(my.ci,13), 2)## beta lci uci d
## (Intercept) 276.32 276.27 276.38 17.04
## parity_cat(1,2] -2.04 -2.10 -1.99 0.13
## parity_cat(2,3] -2.70 -2.78 -2.61 0.11
## parity_cat(3,20] -2.88 -3.02 -2.73 0.07
## sex2 1.79 1.74 1.84 0.12
## Urban3 0.04 -0.01 0.09 0.00
## mother_nationality_cat2Africa 0.86 0.72 0.99 0.02
## mother_nationality_cat2Asia -0.97 -1.10 -0.84 0.03
## mother_nationality_cat2Europe -0.46 -0.51 -0.40 0.03
## mother_nationality_cat2Northern America -0.43 -0.77 -0.10 0.00
## mother_nationality_cat2Southern and Central America -1.83 -2.02 -1.63 0.03
## LanguageFrench 0.47 0.41 0.53 0.03
## LanguageItalian -0.07 -0.22 0.07 0.00
Pre-term birth: yes/no
2007-2020
m_PTB_bam<-bam(PTB~s(BW)+ s(mat_age) + s(birth_Y_M_num) + s(month, bs="cc") +s(mean_ssep2) + parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7)
summary(m_PTB_bam)##
## Family: binomial
## Link function: logit
##
## Formula:
## PTB ~ s(BW) + s(mat_age) + s(birth_Y_M_num) + s(month, bs = "cc") +
## s(mean_ssep2) + parity_cat + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.27903 0.01490 -287.161 < 2e-16 ***
## parity_cat(1,2] 0.10054 0.01186 8.477 < 2e-16 ***
## parity_cat(2,3] 0.22417 0.01854 12.092 < 2e-16 ***
## parity_cat(3,20] 0.40123 0.03079 13.031 < 2e-16 ***
## Urban3 -0.04991 0.01103 -4.523 6.10e-06 ***
## LanguageFrench -0.15747 0.01266 -12.439 < 2e-16 ***
## LanguageItalian -0.19497 0.02760 -7.065 1.60e-12 ***
## mother_nationality_cat2Africa -0.13563 0.03447 -3.935 8.32e-05 ***
## mother_nationality_cat2Asia -0.11206 0.02828 -3.963 7.41e-05 ***
## mother_nationality_cat2Europe 0.05740 0.01198 4.790 1.67e-06 ***
## mother_nationality_cat2Northern America 0.13741 0.07064 1.945 0.0517 .
## mother_nationality_cat2Southern and Central America 0.21971 0.03753 5.854 4.80e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(BW) 8.412 8.892 104691.03 < 2e-16 ***
## s(mat_age) 6.157 7.090 140.77 < 2e-16 ***
## s(birth_Y_M_num) 2.903 3.614 87.03 < 2e-16 ***
## s(month) 4.956 8.000 24.75 4.21e-05 ***
## s(mean_ssep2) 5.550 6.742 24.18 0.000864 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.374 Deviance explained = 42.5%
## fREML = 1.6804e+06 Scale est. = 1 n = 1184367
#adding intercept
#plot(m_PTB_bam, pages = 1, trans = plogis, shift = coef(m_PTB_bam)[1])
plot(m_PTB_bam, trans = plogis, select=1,,shift = coef(m_PTB_bam)[1], shade = TRUE, shade.col = "lightblue") #BW plot(m_PTB_bam, trans = plogis, select=2, ylim=c(0, 0.05), shift = coef(m_PTB_bam)[1], shade = TRUE, shade.col = "lightblue") #matage plot(m_PTB_bam, trans = plogis, select=3, ylim=c(0.01, 0.02), shift = coef(m_PTB_bam)[1], shade = TRUE, shade.col = "lightblue", xlab = "1 : 2007-01, 25: 2009-01, 50: 2011-02, 75: 2013-03, 100: 2015-04, 125: 2017-05, 150: 2019-06") #birth_Y_M_num plot(m_PTB_bam, trans = plogis, select=4, ylim=c(0.012, 0.016), shift = coef(m_PTB_bam)[1], shade = TRUE, shade.col = "lightblue") #month plot(m_PTB_bam, trans = plogis, select=5, ylim=c(0.001, 0.02),shift = coef(m_PTB_bam)[1], shade = TRUE, shade.col = "lightblue") #ssep k.check(m_PTB_bam) #some kindex values are a bit far from 1## k' edf k-index p-value
## s(BW) 9 8.412387 0.9612288 0.7300
## s(mat_age) 9 6.157264 0.9629682 0.7750
## s(birth_Y_M_num) 9 2.902858 0.9672440 0.8225
## s(month) 8 4.955678 0.9422351 0.1700
## s(mean_ssep2) 9 5.549624 0.9700962 0.8725
concurvity(m_PTB_bam)## para s(BW) s(mat_age) s(birth_Y_M_num) s(month) s(mean_ssep2)
## worst 0.7559504 0.02682886 0.12016821 0.01924103 0.015323271 0.17842068
## observed 0.7559504 0.02180735 0.09523380 0.01443010 0.001144207 0.07401603
## estimate 0.7559504 0.01782166 0.07629403 0.01060911 0.001803364 0.08866835
#to calculate the OR of PTB
OR <- exp(coef(m_PTB_bam))
beta <- coef(m_PTB_bam)
Vb <- vcov(m_PTB_bam, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 12), 2)## OR lci uci d
## (Intercept) 0.01 0.01 0.01 2.36
## parity_cat(1,2] 1.11 1.08 1.13 0.06
## parity_cat(2,3] 1.25 1.21 1.30 0.12
## parity_cat(3,20] 1.49 1.41 1.59 0.22
## Urban3 0.95 0.93 0.97 0.03
## LanguageFrench 0.85 0.83 0.88 0.09
## LanguageItalian 0.82 0.78 0.87 0.11
## mother_nationality_cat2Africa 0.87 0.82 0.93 0.07
## mother_nationality_cat2Asia 0.89 0.85 0.94 0.06
## mother_nationality_cat2Europe 1.06 1.03 1.08 0.03
## mother_nationality_cat2Northern America 1.15 1.00 1.32 0.08
## mother_nationality_cat2Southern and Central America 1.25 1.16 1.34 0.12
2008-2010
The variables ssep, birth_Y_M_num were set as non-smooth variables (model including them as smooth variables showed a linear relationship with PTB risk)
m_PTB_bam_2009_2 <-bam(PTB~s(BW)+ s(mat_age) + birth_Y_M_num + mean_ssep2 + parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7_2008_10)
summary(m_PTB_bam_2009_2)##
## Family: binomial
## Link function: logit
##
## Formula:
## PTB ~ s(BW) + s(mat_age) + birth_Y_M_num + mean_ssep2 + parity_cat +
## Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.019551 0.097752 -41.120 < 2e-16 ***
## birth_Y_M_num -0.005654 0.001144 -4.942 7.73e-07 ***
## mean_ssep2 0.000990 0.001455 0.680 0.496325
## parity_cat(1,2] 0.041716 0.027017 1.544 0.122579
## parity_cat(2,3] 0.180976 0.042248 4.284 1.84e-05 ***
## parity_cat(3,20] 0.339967 0.068920 4.933 8.11e-07 ***
## Urban3 -0.061948 0.024545 -2.524 0.011607 *
## LanguageFrench -0.126149 0.028548 -4.419 9.92e-06 ***
## LanguageItalian -0.233104 0.058563 -3.980 6.88e-05 ***
## mother_nationality_cat2Africa -0.016142 0.088602 -0.182 0.855433
## mother_nationality_cat2Asia -0.075878 0.068436 -1.109 0.267542
## mother_nationality_cat2Europe 0.081420 0.027527 2.958 0.003098 **
## mother_nationality_cat2Northern America 0.330920 0.148260 2.232 0.025613 *
## mother_nationality_cat2Southern and Central America 0.273102 0.081409 3.355 0.000795 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(BW) 6.863 7.719 19913.79 <2e-16 ***
## s(mat_age) 5.638 6.628 42.36 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.362 Deviance explained = 41.2%
## fREML = 3.1158e+05 Scale est. = 1 n = 219541
plot(m_PTB_bam_2009_2, trans = plogis, select=1, shade = TRUE, shade.col = "lightblue", shift = coef(m_PTB_bam_2009_2)[1]) #BW plot(m_PTB_bam_2009_2, trans = plogis, select=2, ylim=c(0, 0.1),shade = TRUE, shade.col = "lightblue", shift = coef(m_PTB_bam_2009_2)[1]) #mat age k.check(m_PTB_bam_2009_2)## k' edf k-index p-value
## s(BW) 9 6.862933 0.9861344 0.9950
## s(mat_age) 9 5.638419 0.9393041 0.1675
concurvity(m_PTB_bam_2009_2)## para s(BW) s(mat_age)
## worst 0.9848177 0.02728164 0.13440096
## observed 0.9848177 0.02229960 0.09798691
## estimate 0.9848177 0.01756890 0.08677490
#to calculate the OR of PTB
OR <- exp(coef(m_PTB_bam_2009_2))
beta <- coef(m_PTB_bam_2009_2)
Vb <- vcov(m_PTB_bam_2009_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 16), 2)## OR lci uci d
## (Intercept) 0.02 0.01 0.02 2.22
## birth_Y_M_num 0.99 0.99 1.00 0.00
## mean_ssep2 1.00 1.00 1.00 0.00
## parity_cat(1,2] 1.04 0.99 1.10 0.02
## parity_cat(2,3] 1.20 1.10 1.30 0.10
## parity_cat(3,20] 1.40 1.23 1.61 0.19
## Urban3 0.94 0.90 0.99 0.03
## LanguageFrench 0.88 0.83 0.93 0.07
## LanguageItalian 0.79 0.71 0.89 0.13
## mother_nationality_cat2Africa 0.98 0.83 1.17 0.01
## mother_nationality_cat2Asia 0.93 0.81 1.06 0.04
## mother_nationality_cat2Europe 1.08 1.03 1.14 0.04
## mother_nationality_cat2Northern America 1.39 1.04 1.86 0.18
## mother_nationality_cat2Southern and Central America 1.31 1.12 1.54 0.15
## s(BW).1 0.04 0.00 8.22 1.83
## s(BW).2 3.99 1.38 11.52 0.76
2017-2020
#birth_Y_M, month and mean ssep as linear variables:
m_PTB_bam_covid_2 <-bam(PTB~s(BW)+ s(mat_age) + birth_Y_M_num + mean_ssep2 + parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7_2017_20)
summary(m_PTB_bam_covid_2)##
## Family: binomial
## Link function: logit
##
## Formula:
## PTB ~ s(BW) + s(mat_age) + birth_Y_M_num + mean_ssep2 + parity_cat +
## Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.3636720 0.1057968 -41.246 < 2e-16 ***
## birth_Y_M_num -0.0007553 0.0005170 -1.461 0.144043
## mean_ssep2 0.0005780 0.0011166 0.518 0.604712
## parity_cat(1,2] 0.1523351 0.0203774 7.476 7.68e-14 ***
## parity_cat(2,3] 0.2849282 0.0316713 8.996 < 2e-16 ***
## parity_cat(3,20] 0.4984275 0.0517281 9.636 < 2e-16 ***
## Urban3 -0.0525730 0.0186436 -2.820 0.004804 **
## LanguageFrench -0.2035325 0.0214928 -9.470 < 2e-16 ***
## LanguageItalian -0.1874793 0.0507821 -3.692 0.000223 ***
## mother_nationality_cat2Africa -0.2036098 0.0560310 -3.634 0.000279 ***
## mother_nationality_cat2Asia -0.1052253 0.0467177 -2.252 0.024299 *
## mother_nationality_cat2Europe 0.0727654 0.0203680 3.573 0.000354 ***
## mother_nationality_cat2Northern America 0.0898118 0.1294340 0.694 0.487757
## mother_nationality_cat2Southern and Central America 0.2236519 0.0671577 3.330 0.000868 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(BW) 8.275 8.836 35693.64 < 2e-16 ***
## s(mat_age) 4.149 5.096 31.82 6.4e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.377 Deviance explained = 43.3%
## fREML = 5.9575e+05 Scale est. = 1 n = 418723
plot(m_PTB_bam_covid_2, trans = plogis, select=1, shade = TRUE, shade.col = "lightblue", shift = coef(m_PTB_bam_covid_2)[1]) #BW plot(m_PTB_bam_covid_2, trans = plogis, select=2, ylim=c(0, 0.04),shade = TRUE, shade.col = "lightblue", shift = coef(m_PTB_bam_covid_2)[1]) #mat age k.check(m_PTB_bam_covid_2)## k' edf k-index p-value
## s(BW) 9 8.274686 0.9605274 0.645
## s(mat_age) 9 4.149347 0.9688061 0.860
concurvity(m_PTB_bam_covid_2)## para s(BW) s(mat_age)
## worst 0.9925765 0.02680212 0.10503839
## observed 0.9925765 0.02196203 0.05798255
## estimate 0.9925765 0.01817067 0.06641411
#to calculate the OR of PTB
OR <- exp(coef(m_PTB_bam_covid_2))
beta <- coef(m_PTB_bam_covid_2)
Vb <- vcov(m_PTB_bam_covid_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 14), 2)## OR lci uci d
## (Intercept) 0.01 0.01 0.02 2.41
## birth_Y_M_num 1.00 1.00 1.00 0.00
## mean_ssep2 1.00 1.00 1.00 0.00
## parity_cat(1,2] 1.16 1.12 1.21 0.08
## parity_cat(2,3] 1.33 1.25 1.41 0.16
## parity_cat(3,20] 1.65 1.49 1.82 0.28
## Urban3 0.95 0.91 0.98 0.03
## LanguageFrench 0.82 0.78 0.85 0.11
## LanguageItalian 0.83 0.75 0.92 0.10
## mother_nationality_cat2Africa 0.82 0.73 0.91 0.11
## mother_nationality_cat2Asia 0.90 0.82 0.99 0.06
## mother_nationality_cat2Europe 1.08 1.03 1.12 0.04
## mother_nationality_cat2Northern America 1.09 0.85 1.41 0.05
## mother_nationality_cat2Southern and Central America 1.25 1.10 1.43 0.12
Low birth weight
2007-2020
Month variable was set as a linear term (when included as a smooth term, edf=1.3)
m_LBW_bam<-bam(LBW~s(GA_weeks)+ s(mat_age) + s(birth_Y_M_num) + month +s(mean_ssep2) + parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7)
summary(m_LBW_bam)##
## Family: binomial
## Link function: logit
##
## Formula:
## LBW ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + month + s(mean_ssep2) +
## parity_cat + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.170735 0.019255 -216.605 < 2e-16 ***
## month 0.001397 0.001706 0.819 0.413
## parity_cat(1,2] -0.658150 0.013486 -48.802 < 2e-16 ***
## parity_cat(2,3] -0.793879 0.021835 -36.358 < 2e-16 ***
## parity_cat(3,20] -0.810701 0.037148 -21.824 < 2e-16 ***
## Urban3 0.013761 0.012128 1.135 0.256
## LanguageFrench 0.291054 0.013537 21.500 < 2e-16 ***
## LanguageItalian 0.174153 0.029461 5.911 3.39e-09 ***
## mother_nationality_cat2Africa 0.012109 0.036198 0.335 0.738
## mother_nationality_cat2Asia -0.003507 0.030386 -0.115 0.908
## mother_nationality_cat2Europe -0.155567 0.013448 -11.568 < 2e-16 ***
## mother_nationality_cat2Northern America -0.488842 0.088061 -5.551 2.84e-08 ***
## mother_nationality_cat2Southern and Central America -0.365458 0.044285 -8.252 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 8.213 8.721 95869.72 <2e-16 ***
## s(mat_age) 5.344 6.336 133.99 <2e-16 ***
## s(birth_Y_M_num) 3.573 4.430 54.74 <2e-16 ***
## s(mean_ssep2) 1.919 2.459 141.03 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.409 Deviance explained = 45.3%
## fREML = 1.6808e+06 Scale est. = 1 n = 1184367
plot(m_LBW_bam, trans = plogis, select=1, ylim=c(0, 1), shift = coef(m_LBW_bam)[1], shade = TRUE, shade.col = "lightblue") #GA_weeks plot(m_LBW_bam, trans = plogis, select=2, ylim=c(.005, .03), shift = coef(m_LBW_bam)[1], shade = TRUE, shade.col = "lightblue") #mat_age plot(m_LBW_bam, trans = plogis, select=3, ylim=c(.01, .02), shift = coef(m_LBW_bam)[1], shade = TRUE, shade.col = "lightblue", xlab = "1 : 2007-01, 25: 2009-01, 50: 2011-02, 75: 2013-03, 100: 2015-04, 125: 2017-05, 150: 2019-06") #birth_Y_M_num plot(m_LBW_bam, trans = plogis, select=4, ylim=c(.005, .025), shift = coef(m_LBW_bam)[1], shade = TRUE, shade.col = "lightblue") #ssep k.check(m_LBW_bam)## k' edf k-index p-value
## s(GA_weeks) 9 8.213276 0.9042977 0.0400
## s(mat_age) 9 5.344002 0.8971698 0.0275
## s(birth_Y_M_num) 9 3.573213 0.9363513 0.7175
## s(mean_ssep2) 9 1.918593 0.9462893 0.9250
concurvity(m_LBW_bam)## para s(GA_weeks) s(mat_age) s(birth_Y_M_num) s(mean_ssep2)
## worst 0.8737817 0.016822302 0.12195038 0.02817936 0.17827913
## observed 0.8737817 0.008094305 0.05125256 0.01251626 0.12983570
## estimate 0.8737817 0.009212994 0.07807477 0.01193099 0.08866264
#to calculate the OR of LBW for non smooth terms
OR <- exp(coef(m_LBW_bam))
beta <- coef(m_LBW_bam)
Vb <- vcov(m_LBW_bam, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 14), 2)## OR lci uci d
## (Intercept) 0.02 0.01 2.000000e-02 2.30
## month 1.00 1.00 1.000000e+00 0.00
## parity_cat(1,2] 0.52 0.50 5.300000e-01 0.36
## parity_cat(2,3] 0.45 0.43 4.700000e-01 0.44
## parity_cat(3,20] 0.44 0.41 4.800000e-01 0.45
## Urban3 1.01 0.99 1.040000e+00 0.01
## LanguageFrench 1.34 1.30 1.370000e+00 0.16
## LanguageItalian 1.19 1.12 1.260000e+00 0.10
## mother_nationality_cat2Africa 1.01 0.94 1.090000e+00 0.01
## mother_nationality_cat2Asia 1.00 0.94 1.060000e+00 0.00
## mother_nationality_cat2Europe 0.86 0.83 8.800000e-01 0.09
## mother_nationality_cat2Northern America 0.61 0.52 7.300000e-01 0.27
## mother_nationality_cat2Southern and Central America 0.69 0.64 7.600000e-01 0.20
## s(GA_weeks).1 7449.76 0.00 2.703068e+10 4.93
2008-2010
As the variables birth_Y_M_num and month are too correlated : I did the model without month variable. Also, birth_Y_M_num and mean_ssep2 were set as non smooth variables (edf ~1 fit hey were included as smooth variables)
m_LBW_bam_2009_2 <-bam(LBW~s(GA_weeks)+ s(mat_age) + birth_Y_M_num + mean_ssep2 + parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7_2008_10)
summary(m_LBW_bam_2009_2)##
## Family: binomial
## Link function: logit
##
## Formula:
## LBW ~ s(GA_weeks) + s(mat_age) + birth_Y_M_num + mean_ssep2 +
## parity_cat + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.768470 0.109730 -34.343 < 2e-16 ***
## birth_Y_M_num 0.007536 0.001295 5.818 5.97e-09 ***
## mean_ssep2 -0.009651 0.001648 -5.857 4.70e-09 ***
## parity_cat(1,2] -0.630849 0.031030 -20.330 < 2e-16 ***
## parity_cat(2,3] -0.734615 0.050486 -14.551 < 2e-16 ***
## parity_cat(3,20] -0.783224 0.085543 -9.156 < 2e-16 ***
## Urban3 0.052348 0.027710 1.889 0.058878 .
## LanguageFrench 0.247259 0.031350 7.887 3.10e-15 ***
## LanguageItalian 0.297147 0.061094 4.864 1.15e-06 ***
## mother_nationality_cat2Africa -0.074537 0.095417 -0.781 0.434702
## mother_nationality_cat2Asia -0.024923 0.075955 -0.328 0.742818
## mother_nationality_cat2Europe -0.099973 0.031034 -3.221 0.001276 **
## mother_nationality_cat2Northern America -0.725608 0.208064 -3.487 0.000488 ***
## mother_nationality_cat2Southern and Central America -0.423301 0.099594 -4.250 2.14e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 6.997 7.715 18388.85 < 2e-16 ***
## s(mat_age) 3.824 4.737 21.61 0.000488 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.401 Deviance explained = 44.1%
## fREML = 3.1294e+05 Scale est. = 1 n = 219541
plot(m_LBW_bam_2009_2, trans = plogis, select=1, ylim=c(0, 1), shift = coef(m_LBW_bam_2009_2)[1], shade = TRUE, shade.col = "lightblue") #GA_weeks plot(m_LBW_bam_2009_2, trans = plogis, select=2, ylim=c(.005, .045), shift = coef(m_LBW_bam_2009_2)[1], shade = TRUE, shade.col = "lightblue") #mat_age k.check(m_LBW_bam_2009_2)## k' edf k-index p-value
## s(GA_weeks) 9 6.996614 0.9538033 0.7825
## s(mat_age) 9 3.823819 0.9136238 0.0125
concurvity(m_LBW_bam_2009_2)## para s(GA_weeks) s(mat_age)
## worst 0.9848161 0.015199687 0.13684440
## observed 0.9848161 0.007602613 0.03931183
## estimate 0.9848161 0.008577179 0.08857085
#to calculate the OR of LBW for non smooth terms
OR <- exp(coef(m_LBW_bam_2009_2))
beta <- coef(m_LBW_bam_2009_2)
Vb <- vcov(m_LBW_bam_2009_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 15), 2)## OR lci uci d
## (Intercept) 0.02 0.02 0.03 2.08
## birth_Y_M_num 1.01 1.01 1.01 0.00
## mean_ssep2 0.99 0.99 0.99 0.01
## parity_cat(1,2] 0.53 0.50 0.57 0.35
## parity_cat(2,3] 0.48 0.43 0.53 0.41
## parity_cat(3,20] 0.46 0.39 0.54 0.43
## Urban3 1.05 1.00 1.11 0.03
## LanguageFrench 1.28 1.20 1.36 0.14
## LanguageItalian 1.35 1.19 1.52 0.16
## mother_nationality_cat2Africa 0.93 0.77 1.12 0.04
## mother_nationality_cat2Asia 0.98 0.84 1.13 0.01
## mother_nationality_cat2Europe 0.90 0.85 0.96 0.06
## mother_nationality_cat2Northern America 0.48 0.32 0.73 0.40
## mother_nationality_cat2Southern and Central America 0.65 0.54 0.80 0.23
## s(GA_weeks).1 5.08 0.00 38388095.15 0.90
2017-2020
Month variable was also not included in the model (correlation between birth_Y_M and month variables), and we set k=3 for s(birth_Y_M_num) because its almost linear (edf=1.99). Mean ssep set as linear term because almost linear (edf=1.47).
m_LBW_bam_covid_2 <-bam(LBW~s(GA_weeks)+ s(mat_age) + s(birth_Y_M_num) + mean_ssep2 + parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7_2017_20)
summary(m_LBW_bam_covid_2)##
## Family: binomial
## Link function: logit
##
## Formula:
## LBW ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + mean_ssep2 +
## parity_cat + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.746543 0.076802 -48.782 < 2e-16 ***
## mean_ssep2 -0.008804 0.001228 -7.168 7.63e-13 ***
## parity_cat(1,2] -0.686799 0.022956 -29.918 < 2e-16 ***
## parity_cat(2,3] -0.852257 0.037059 -22.997 < 2e-16 ***
## parity_cat(3,20] -0.815425 0.061251 -13.313 < 2e-16 ***
## Urban3 0.007540 0.020547 0.367 0.713641
## LanguageFrench 0.313152 0.022859 13.700 < 2e-16 ***
## LanguageItalian 0.154051 0.054069 2.849 0.004383 **
## mother_nationality_cat2Africa 0.089957 0.057261 1.571 0.116178
## mother_nationality_cat2Asia -0.075480 0.050316 -1.500 0.133584
## mother_nationality_cat2Europe -0.194355 0.022792 -8.527 < 2e-16 ***
## mother_nationality_cat2Northern America -0.507320 0.154150 -3.291 0.000998 ***
## mother_nationality_cat2Southern and Central America -0.285689 0.076720 -3.724 0.000196 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 7.448 8.103 32914.874 <2e-16 ***
## s(mat_age) 4.669 5.654 73.839 <2e-16 ***
## s(birth_Y_M_num) 2.872 3.572 3.982 0.489
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.41 Deviance explained = 46.1%
## fREML = 5.8924e+05 Scale est. = 1 n = 418723
plot(m_LBW_bam_covid_2, trans = plogis, select=1, ylim=c(0, 1), shift = coef(m_LBW_bam_covid_2)[1], shade = TRUE, shade.col = "lightblue") #GA_weeks plot(m_LBW_bam_covid_2, trans = plogis, select=2, ylim=c(.005, .06), shift = coef(m_LBW_bam_covid_2)[1], shade = TRUE, shade.col = "lightblue") #mat_age plot(m_LBW_bam_covid_2, trans = plogis, select=3, ylim=c(.015, .025), shift = coef(m_LBW_bam_covid_2)[1], shade = TRUE, shade.col = "lightblue", xlab = "120 : 2016-12, 130: 2017-10, 140: 2018-08, 150: 2019-06, 160: 2020-04") #birth_Y_M_num k.check(m_LBW_bam_covid_2)## k' edf k-index p-value
## s(GA_weeks) 9 7.447708 0.9365161 0.5050
## s(mat_age) 9 4.668823 0.9111971 0.0325
## s(birth_Y_M_num) 9 2.872041 0.9125434 0.0650
concurvity(m_LBW_bam_covid_2)## para s(GA_weeks) s(mat_age) s(birth_Y_M_num)
## worst 0.9828652 0.018103817 0.10646952 0.0021585461
## observed 0.9828652 0.009262249 0.05754737 0.0004403409
## estimate 0.9828652 0.010347794 0.06807860 0.0017328347
#to calculate the OR of LBW for non smooth terms
OR <- exp(coef(m_LBW_bam_covid_2))
beta <- coef(m_LBW_bam_covid_2)
Vb <- vcov(m_LBW_bam_covid_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 14), 2)## OR lci uci d
## (Intercept) 0.02 0.02 0.03 2.07
## mean_ssep2 0.99 0.99 0.99 0.00
## parity_cat(1,2] 0.50 0.48 0.53 0.38
## parity_cat(2,3] 0.43 0.40 0.46 0.47
## parity_cat(3,20] 0.44 0.39 0.50 0.45
## Urban3 1.01 0.97 1.05 0.00
## LanguageFrench 1.37 1.31 1.43 0.17
## LanguageItalian 1.17 1.05 1.30 0.09
## mother_nationality_cat2Africa 1.09 0.98 1.22 0.05
## mother_nationality_cat2Asia 0.93 0.84 1.02 0.04
## mother_nationality_cat2Europe 0.82 0.79 0.86 0.11
## mother_nationality_cat2Northern America 0.60 0.45 0.81 0.28
## mother_nationality_cat2Southern and Central America 0.75 0.65 0.87 0.16
## s(GA_weeks).1 0.00 0.00 6992.42 4.53
LBW vs “Normal” birthweight
2007-2020
Month was put as a linear variable.
m_LBW_normal_bam_2 <-bam(LBW_norm~s(GA_weeks)+ s(mat_age) + s(birth_Y_M_num) +s(mean_ssep2)+ month + parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7)
summary(m_LBW_normal_bam_2)##
## Family: binomial
## Link function: logit
##
## Formula:
## LBW_norm ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + s(mean_ssep2) +
## month + parity_cat + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.031800 0.018808 -214.362 < 2e-16 ***
## month 0.001412 0.001708 0.827 0.408
## parity_cat(1,2] -0.641257 0.013502 -47.492 < 2e-16 ***
## parity_cat(2,3] -0.768985 0.021889 -35.131 < 2e-16 ***
## parity_cat(3,20] -0.781121 0.037284 -20.950 < 2e-16 ***
## Urban3 0.014092 0.012148 1.160 0.246
## LanguageFrench 0.286932 0.013551 21.174 < 2e-16 ***
## LanguageItalian 0.167455 0.029467 5.683 1.33e-08 ***
## mother_nationality_cat2Africa 0.022309 0.036294 0.615 0.539
## mother_nationality_cat2Asia -0.002687 0.030407 -0.088 0.930
## mother_nationality_cat2Europe -0.149774 0.013463 -11.125 < 2e-16 ***
## mother_nationality_cat2Northern America -0.483764 0.088194 -5.485 4.13e-08 ***
## mother_nationality_cat2Southern and Central America -0.359488 0.044349 -8.106 5.24e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 8.168 8.685 93074.42 <2e-16 ***
## s(mat_age) 5.329 6.328 135.57 <2e-16 ***
## s(birth_Y_M_num) 3.589 4.450 53.91 <2e-16 ***
## s(mean_ssep2) 2.038 2.617 142.52 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.408 Deviance explained = 44.6%
## fREML = 1.5452e+06 Scale est. = 1 n = 1089985
plot(m_LBW_normal_bam_2, trans = plogis, select=1, ylim=c(0, 1), shift = coef(m_LBW_normal_bam_2)[1], shade = TRUE, shade.col = "lightblue") #GA_weeks plot(m_LBW_normal_bam_2, trans = plogis, select=2, ylim=c(.005, .03), shift = coef(m_LBW_normal_bam_2)[1], shade = TRUE, shade.col = "lightblue") #mat_age plot(m_LBW_normal_bam_2, trans = plogis, select=3, ylim=c(.005, .025), shift = coef(m_LBW_normal_bam_2)[1], shade = TRUE, shade.col = "lightblue", xlab = "1 : 2007-01, 25: 2009-01, 50: 2011-02, 75: 2013-03, 100: 2015-04, 125: 2017-05, 150: 2019-06") #birth_Y_M_num plot(m_LBW_normal_bam_2, trans = plogis, select=4, ylim=c(.005, .030), shift = coef(m_LBW_normal_bam_2)[1], shade = TRUE, shade.col = "lightblue") #ssep k.check(m_LBW_normal_bam_2)## k' edf k-index p-value
## s(GA_weeks) 9 8.167501 0.9100811 0.0625
## s(mat_age) 9 5.328882 0.9463479 0.8775
## s(birth_Y_M_num) 9 3.589290 0.9666339 0.9950
## s(mean_ssep2) 9 2.038040 0.9188551 0.1725
concurvity(m_LBW_normal_bam_2)## para s(GA_weeks) s(mat_age) s(birth_Y_M_num) s(mean_ssep2)
## worst 0.8731229 0.018789368 0.12052533 0.02813823 0.17913474
## observed 0.8731229 0.007916049 0.05050726 0.01292456 0.12898005
## estimate 0.8731229 0.009507133 0.07697922 0.01210150 0.08110965
#to calculate the OR of LBW vs normal BW
OR <- exp(coef(m_LBW_normal_bam_2))
beta <- coef(m_LBW_normal_bam_2)
Vb <- vcov(m_LBW_normal_bam_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 14), 2)## OR lci uci d
## (Intercept) 0.02 0.02 2.000000e-02 2.23
## month 1.00 1.00 1.000000e+00 0.00
## parity_cat(1,2] 0.53 0.51 5.400000e-01 0.35
## parity_cat(2,3] 0.46 0.44 4.800000e-01 0.42
## parity_cat(3,20] 0.46 0.43 4.900000e-01 0.43
## Urban3 1.01 0.99 1.040000e+00 0.01
## LanguageFrench 1.33 1.30 1.370000e+00 0.16
## LanguageItalian 1.18 1.12 1.250000e+00 0.09
## mother_nationality_cat2Africa 1.02 0.95 1.100000e+00 0.01
## mother_nationality_cat2Asia 1.00 0.94 1.060000e+00 0.00
## mother_nationality_cat2Europe 0.86 0.84 8.800000e-01 0.08
## mother_nationality_cat2Northern America 0.62 0.52 7.300000e-01 0.27
## mother_nationality_cat2Southern and Central America 0.70 0.64 7.600000e-01 0.20
## s(GA_weeks).1 3264.04 0.00 1.662361e+10 4.47
2008-2010
As always, month wasnt kept in the model (correlation between month and birth_Y_M_num variables). The variables birth_Y_M_num and ssep were set as linear terms.
m_LBW_normal_bam_2009_2 <-bam(LBW_norm~s(GA_weeks)+ s(mat_age) + birth_Y_M_num + mean_ssep2 + parity_cat +Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7_2008_10)
summary(m_LBW_normal_bam_2009_2)##
## Family: binomial
## Link function: logit
##
## Formula:
## LBW_norm ~ s(GA_weeks) + s(mat_age) + birth_Y_M_num + mean_ssep2 +
## parity_cat + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.668732 0.109567 -33.484 < 2e-16 ***
## birth_Y_M_num 0.007551 0.001297 5.823 5.79e-09 ***
## mean_ssep2 -0.009731 0.001649 -5.902 3.60e-09 ***
## parity_cat(1,2] -0.613558 0.031069 -19.748 < 2e-16 ***
## parity_cat(2,3] -0.705117 0.050613 -13.932 < 2e-16 ***
## parity_cat(3,20] -0.744827 0.085828 -8.678 < 2e-16 ***
## Urban3 0.054712 0.027728 1.973 0.04848 *
## LanguageFrench 0.240776 0.031369 7.676 1.65e-14 ***
## LanguageItalian 0.295217 0.061132 4.829 1.37e-06 ***
## mother_nationality_cat2Africa -0.059472 0.095738 -0.621 0.53447
## mother_nationality_cat2Asia -0.024337 0.076024 -0.320 0.74887
## mother_nationality_cat2Europe -0.092732 0.031069 -2.985 0.00284 **
## mother_nationality_cat2Northern America -0.725525 0.208431 -3.481 0.00050 ***
## mother_nationality_cat2Southern and Central America -0.414539 0.099769 -4.155 3.25e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 6.989 7.700 17816.26 < 2e-16 ***
## s(mat_age) 3.834 4.748 21.41 0.000533 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.4 Deviance explained = 43.3%
## fREML = 2.8612e+05 Scale est. = 1 n = 201741
plot(m_LBW_normal_bam_2009_2, trans = plogis, select=1, ylim=c(0, 1), shift = coef(m_LBW_normal_bam_2009_2)[1], shade = TRUE, shade.col = "lightblue") #GA_weeks plot(m_LBW_normal_bam_2009_2, trans = plogis, select=2, ylim=c(.005, .05), shift = coef(m_LBW_normal_bam_2009_2)[1], shade = TRUE, shade.col = "lightblue") #mat_age k.check(m_LBW_normal_bam_2009_2)## k' edf k-index p-value
## s(GA_weeks) 9 6.989480 0.9334763 0.0875
## s(mat_age) 9 3.833701 0.9648590 0.8550
concurvity(m_LBW_normal_bam_2009_2)## para s(GA_weeks) s(mat_age)
## worst 0.9847877 0.016606006 0.13500520
## observed 0.9847877 0.007057865 0.04016032
## estimate 0.9847877 0.008374393 0.08809935
#to calculate the OR of LBW vs normal BW
OR <- exp(coef(m_LBW_normal_bam_2009_2))
beta <- coef(m_LBW_normal_bam_2009_2)
Vb <- vcov(m_LBW_normal_bam_2009_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 15), 2)## OR lci uci d
## (Intercept) 0.03 0.02 0.03 2.03
## birth_Y_M_num 1.01 1.01 1.01 0.00
## mean_ssep2 0.99 0.99 0.99 0.01
## parity_cat(1,2] 0.54 0.51 0.58 0.34
## parity_cat(2,3] 0.49 0.45 0.55 0.39
## parity_cat(3,20] 0.47 0.40 0.56 0.41
## Urban3 1.06 1.00 1.12 0.03
## LanguageFrench 1.27 1.20 1.35 0.13
## LanguageItalian 1.34 1.19 1.51 0.16
## mother_nationality_cat2Africa 0.94 0.78 1.14 0.03
## mother_nationality_cat2Asia 0.98 0.84 1.13 0.01
## mother_nationality_cat2Europe 0.91 0.86 0.97 0.05
## mother_nationality_cat2Northern America 0.48 0.32 0.73 0.40
## mother_nationality_cat2Southern and Central America 0.66 0.54 0.80 0.23
## s(GA_weeks).1 6.45 0.00 29194269.55 1.03
2017-2020
Ssep variable was set as linear variables (when included as smooth variable, edf= 1.62)
m_LBW_normal_bam_covid_2 <-bam(LBW_norm~s(GA_weeks)+ s(mat_age) + s(birth_Y_M_num) + mean_ssep2 + parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7_2017_20)
summary(m_LBW_normal_bam_covid_2)##
## Family: binomial
## Link function: logit
##
## Formula:
## LBW_norm ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + mean_ssep2 +
## parity_cat + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.640834 0.076662 -47.492 < 2e-16 ***
## mean_ssep2 -0.008887 0.001229 -7.234 4.7e-13 ***
## parity_cat(1,2] -0.669058 0.022985 -29.109 < 2e-16 ***
## parity_cat(2,3] -0.827979 0.037154 -22.285 < 2e-16 ***
## parity_cat(3,20] -0.790600 0.061496 -12.856 < 2e-16 ***
## Urban3 0.007185 0.020572 0.349 0.726900
## LanguageFrench 0.310345 0.022870 13.570 < 2e-16 ***
## LanguageItalian 0.143493 0.054060 2.654 0.007946 **
## mother_nationality_cat2Africa 0.098204 0.057420 1.710 0.087214 .
## mother_nationality_cat2Asia -0.075501 0.050351 -1.499 0.133746
## mother_nationality_cat2Europe -0.189982 0.022818 -8.326 < 2e-16 ***
## mother_nationality_cat2Northern America -0.497354 0.154524 -3.219 0.001288 **
## mother_nationality_cat2Southern and Central America -0.280476 0.076831 -3.651 0.000262 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 7.403 8.063 31977.585 <2e-16 ***
## s(mat_age) 4.656 5.641 73.474 <2e-16 ***
## s(birth_Y_M_num) 1.008 1.015 0.171 0.688
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.409 Deviance explained = 45.4%
## fREML = 5.4369e+05 Scale est. = 1 n = 385482
plot(m_LBW_normal_bam_covid_2, trans = plogis, select=1, ylim=c(0, 1), shift = coef(m_LBW_normal_bam_covid_2)[1], shade = TRUE, shade.col = "lightblue") #GA_weeks plot(m_LBW_normal_bam_covid_2, trans = plogis, select=2, ylim=c(.005, .07), shift = coef(m_LBW_normal_bam_covid_2)[1], shade = TRUE, shade.col = "lightblue") #mat_age plot(m_LBW_normal_bam_covid_2, trans = plogis, select=3, ylim=c(.020, .028), shift = coef(m_LBW_normal_bam_covid_2)[1], shade = TRUE, shade.col = "lightblue", xlab = "120 : 2016-12, 130: 2017-10, 140: 2018-08, 150: 2019-06, 160: 2020-04") #birth_Y_M_num k.check(m_LBW_normal_bam_covid_2)## k' edf k-index p-value
## s(GA_weeks) 9 7.403145 0.9180168 0.0000
## s(mat_age) 9 4.656068 0.9808477 0.9125
## s(birth_Y_M_num) 9 1.007556 0.9743953 0.8000
concurvity(m_LBW_normal_bam_covid_2)## para s(GA_weeks) s(mat_age) s(birth_Y_M_num)
## worst 0.9827434 0.021022328 0.10523619 0.002176946
## observed 0.9827434 0.009317885 0.05636597 0.002144167
## estimate 0.9827434 0.010863461 0.06709538 0.001694894
#to calculate the OR of LBW vs normal BW
OR <- exp(coef(m_LBW_normal_bam_covid_2))
beta <- coef(m_LBW_normal_bam_covid_2)
Vb <- vcov(m_LBW_normal_bam_covid_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 14), 2)## OR lci uci d
## (Intercept) 0.03 0.02 0.03 2.01
## mean_ssep2 0.99 0.99 0.99 0.00
## parity_cat(1,2] 0.51 0.49 0.54 0.37
## parity_cat(2,3] 0.44 0.41 0.47 0.46
## parity_cat(3,20] 0.45 0.40 0.51 0.44
## Urban3 1.01 0.97 1.05 0.00
## LanguageFrench 1.36 1.30 1.43 0.17
## LanguageItalian 1.15 1.04 1.28 0.08
## mother_nationality_cat2Africa 1.10 0.99 1.23 0.05
## mother_nationality_cat2Asia 0.93 0.84 1.02 0.04
## mother_nationality_cat2Europe 0.83 0.79 0.86 0.10
## mother_nationality_cat2Northern America 0.61 0.45 0.82 0.27
## mother_nationality_cat2Southern and Central America 0.76 0.65 0.88 0.15
## s(GA_weeks).1 0.00 0.00 963.92 5.99
Macrosomia vs “Normal” birthweight
2007-2020
m_macro_normal_bam <-bam(macro_norm~s(GA_weeks)+ s(mat_age) + s(birth_Y_M_num) + s(month, bs="cc") +s(mean_ssep2) + parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7)
summary(m_macro_normal_bam)##
## Family: binomial
## Link function: logit
##
## Formula:
## macro_norm ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + s(month,
## bs = "cc") + s(mean_ssep2) + parity_cat + Urban3 + Language +
## mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.994859 0.008154 -367.273 < 2e-16 ***
## parity_cat(1,2] 0.584826 0.008005 73.060 < 2e-16 ***
## parity_cat(2,3] 0.847783 0.011005 77.035 < 2e-16 ***
## parity_cat(3,20] 0.990751 0.017605 56.275 < 2e-16 ***
## Urban3 0.012021 0.007387 1.627 0.104
## LanguageFrench -0.268103 0.009036 -29.671 < 2e-16 ***
## LanguageItalian -0.375315 0.022312 -16.821 < 2e-16 ***
## mother_nationality_cat2Africa 0.048424 0.020474 2.365 0.018 *
## mother_nationality_cat2Asia -0.107983 0.021358 -5.056 4.28e-07 ***
## mother_nationality_cat2Europe 0.167296 0.007817 21.403 < 2e-16 ***
## mother_nationality_cat2Northern America 0.355488 0.041791 8.506 < 2e-16 ***
## mother_nationality_cat2Southern and Central America 0.144267 0.027725 5.204 1.96e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 7.541 8.211 32538.81 < 2e-16 ***
## s(mat_age) 4.530 5.468 15.04 0.01459 *
## s(birth_Y_M_num) 6.528 7.665 87.34 < 2e-16 ***
## s(month) 2.373 8.000 10.91 0.00256 **
## s(mean_ssep2) 6.521 7.689 59.69 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.0442 Deviance explained = 7.36%
## fREML = 1.6044e+06 Scale est. = 1 n = 1131761
plot(m_macro_normal_bam, trans = plogis, select=1, ylim=c(0, 0.6), shift = coef(m_macro_normal_bam)[1], shade = TRUE, shade.col = "lightblue") #GA_weeks plot(m_macro_normal_bam, trans = plogis, select=2, ylim=c(0.03, 0.06), shift = coef(m_macro_normal_bam)[1], shade = TRUE, shade.col = "lightblue") #mat_age plot(m_macro_normal_bam, trans = plogis, select=3, ylim=c(0.045, 0.055), shift = coef(m_macro_normal_bam)[1], shade = TRUE, shade.col = "lightblue", xlab = "1 : 2007-01, 25: 2009-01, 50: 2011-02, 75: 2013-03, 100: 2015-04, 125: 2017-05, 150: 2019-06") #birth_Y_M_num plot(m_macro_normal_bam, trans = plogis, select=4, ylim=c(0.045, 0.052), shift = coef(m_macro_normal_bam)[1], shade = TRUE, shade.col = "lightblue") #month plot(m_macro_normal_bam, trans = plogis, select=5, ylim=c(0.02, 0.06), shift = coef(m_macro_normal_bam)[1], shade = TRUE, shade.col = "lightblue") #ssep k.check(m_macro_normal_bam)## k' edf k-index p-value
## s(GA_weeks) 9 7.541344 0.9507982 0.4625
## s(mat_age) 9 4.530288 0.9287874 0.0550
## s(birth_Y_M_num) 9 6.528365 0.9416723 0.2225
## s(month) 8 2.372616 0.9220899 0.0125
## s(mean_ssep2) 9 6.520942 0.9295978 0.0625
concurvity(m_macro_normal_bam)## para s(GA_weeks) s(mat_age) s(birth_Y_M_num) s(month) s(mean_ssep2)
## worst 0.7577101 0.01875457 0.12424820 0.01940922 0.015314275 0.17847675
## observed 0.7577101 0.01355338 0.02404706 0.01238053 0.003513315 0.02299481
## estimate 0.7577101 0.01169164 0.07994586 0.01109062 0.001900336 0.08204614
#to calculate the OR of Macrosomia vs normal BW
OR <- exp(coef(m_macro_normal_bam))
beta <- coef(m_macro_normal_bam)
Vb <- vcov(m_macro_normal_bam, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 13), 2)## OR lci uci d
## (Intercept) 0.05 0.05 0.05 1.65
## parity_cat(1,2] 1.79 1.77 1.82 0.32
## parity_cat(2,3] 2.33 2.28 2.39 0.47
## parity_cat(3,20] 2.69 2.60 2.79 0.55
## Urban3 1.01 1.00 1.03 0.01
## LanguageFrench 0.76 0.75 0.78 0.15
## LanguageItalian 0.69 0.66 0.72 0.21
## mother_nationality_cat2Africa 1.05 1.01 1.09 0.03
## mother_nationality_cat2Asia 0.90 0.86 0.94 0.06
## mother_nationality_cat2Europe 1.18 1.16 1.20 0.09
## mother_nationality_cat2Northern America 1.43 1.31 1.55 0.20
## mother_nationality_cat2Southern and Central America 1.16 1.09 1.22 0.08
## s(GA_weeks).1 24.33 0.06 9681.62 1.76
2008-2010
We set mat age and birth_Y_M_num variables as linear (edf= 1.005 and 1.378, respectively)
m_macro_normal_bam_2009_2 <-bam(macro_norm~s(GA_weeks)+ s(mean_ssep2) + mat_age + birth_Y_M_num +parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7_2008_10)
summary(m_macro_normal_bam_2009_2)##
## Family: binomial
## Link function: logit
##
## Formula:
## macro_norm ~ s(GA_weeks) + s(mean_ssep2) + mat_age + birth_Y_M_num +
## parity_cat + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.0556170 0.0594475 -51.400 < 2e-16 ***
## mat_age 0.0033176 0.0017214 1.927 0.05395 .
## birth_Y_M_num -0.0009099 0.0007786 -1.169 0.24257
## parity_cat(1,2] 0.5523156 0.0184479 29.939 < 2e-16 ***
## parity_cat(2,3] 0.8408195 0.0252899 33.247 < 2e-16 ***
## parity_cat(3,20] 0.9770053 0.0400311 24.406 < 2e-16 ***
## Urban3 0.0146125 0.0169097 0.864 0.38751
## LanguageFrench -0.2531999 0.0210700 -12.017 < 2e-16 ***
## LanguageItalian -0.3253135 0.0478590 -6.797 1.07e-11 ***
## mother_nationality_cat2Africa 0.3451273 0.0536331 6.435 1.23e-10 ***
## mother_nationality_cat2Asia -0.1519379 0.0536799 -2.830 0.00465 **
## mother_nationality_cat2Europe 0.1924359 0.0182083 10.569 < 2e-16 ***
## mother_nationality_cat2Northern America 0.2988927 0.0954095 3.133 0.00173 **
## mother_nationality_cat2Southern and Central America 0.2612904 0.0577743 4.523 6.11e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 6.587 7.465 6407.742 <2e-16 ***
## s(mean_ssep2) 2.379 3.061 6.143 0.103
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.047 Deviance explained = 7.81%
## fREML = 2.9678e+05 Scale est. = 1 n = 209707
plot(m_macro_normal_bam_2009_2, trans = plogis, select=1, ylim=c(0, 0.8), shift = coef(m_macro_normal_bam_2009_2)[1], shade = TRUE, shade.col = "lightblue") #GA_weeks plot(m_macro_normal_bam_2009_2, trans = plogis, select=2, ylim=c(0.03, 0.06), shift = coef(m_macro_normal_bam_2009_2)[1], shade = TRUE, shade.col = "lightblue") #ssep k.check(m_macro_normal_bam_2009_2)## k' edf k-index p-value
## s(GA_weeks) 9 6.586877 0.9209356 0.105
## s(mean_ssep2) 9 2.379232 0.9546030 0.910
concurvity(m_macro_normal_bam_2009_2)## para s(GA_weeks) s(mean_ssep2)
## worst 0.9806178 0.01627484 0.17849470
## observed 0.9806178 0.01149629 0.16046673
## estimate 0.9806178 0.01129053 0.08121233
#to calculate the OR of Macrosomia vs normal BW
OR <- exp(coef(m_macro_normal_bam_2009_2))
beta <- coef(m_macro_normal_bam_2009_2)
Vb <- vcov(m_macro_normal_bam_2009_2, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 15), 2)## OR lci uci d
## (Intercept) 0.05 0.04 0.05 1.69
## mat_age 1.00 1.00 1.01 0.00
## birth_Y_M_num 1.00 1.00 1.00 0.00
## parity_cat(1,2] 1.74 1.68 1.80 0.31
## parity_cat(2,3] 2.32 2.21 2.44 0.46
## parity_cat(3,20] 2.66 2.46 2.87 0.54
## Urban3 1.01 0.98 1.05 0.01
## LanguageFrench 0.78 0.74 0.81 0.14
## LanguageItalian 0.72 0.66 0.79 0.18
## mother_nationality_cat2Africa 1.41 1.27 1.57 0.19
## mother_nationality_cat2Asia 0.86 0.77 0.95 0.08
## mother_nationality_cat2Europe 1.21 1.17 1.26 0.11
## mother_nationality_cat2Northern America 1.35 1.12 1.63 0.17
## mother_nationality_cat2Southern and Central America 1.30 1.16 1.45 0.14
## s(GA_weeks).1 24.81 0.01 61507.45 1.77
2017-2020
m_macro_normal_bam_covid <-bam(macro_norm~s(GA_weeks)+ s(mat_age) + s(birth_Y_M_num) +s(mean_ssep2) + parity_cat + Urban3 + Language + mother_nationality_cat2,family="binomial", data=bevn_eco_in7_2017_20)
summary(m_macro_normal_bam_covid)##
## Family: binomial
## Link function: logit
##
## Formula:
## macro_norm ~ s(GA_weeks) + s(mat_age) + s(birth_Y_M_num) + s(mean_ssep2) +
## parity_cat + Urban3 + Language + mother_nationality_cat2
##
## Parametric coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.96652 0.01363 -217.656 < 2e-16 ***
## parity_cat(1,2] 0.61521 0.01349 45.596 < 2e-16 ***
## parity_cat(2,3] 0.88561 0.01844 48.031 < 2e-16 ***
## parity_cat(3,20] 1.02155 0.02980 34.285 < 2e-16 ***
## Urban3 0.00215 0.01238 0.174 0.86214
## LanguageFrench -0.28362 0.01494 -18.990 < 2e-16 ***
## LanguageItalian -0.41848 0.04032 -10.380 < 2e-16 ***
## mother_nationality_cat2Africa -0.06377 0.03222 -1.979 0.04778 *
## mother_nationality_cat2Asia -0.10459 0.03419 -3.059 0.00222 **
## mother_nationality_cat2Europe 0.12629 0.01312 9.626 < 2e-16 ***
## mother_nationality_cat2Northern America 0.36303 0.07316 4.962 6.97e-07 ***
## mother_nationality_cat2Southern and Central America 0.10826 0.05059 2.140 0.03236 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df Chi.sq p-value
## s(GA_weeks) 6.571 7.417 11000.26 < 2e-16 ***
## s(mat_age) 3.929 4.817 30.78 1.50e-05 ***
## s(birth_Y_M_num) 5.391 6.528 28.57 9.36e-05 ***
## s(mean_ssep2) 4.500 5.604 46.80 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.0426 Deviance explained = 7.05%
## fREML = 5.6818e+05 Scale est. = 1 n = 400580
plot(m_macro_normal_bam_covid, trans = plogis, select=1, ylim=c(0, 0.7), shift = coef(m_macro_normal_bam_covid)[1], shade = TRUE, shade.col = "lightblue") #GA_weeks plot(m_macro_normal_bam_covid, trans = plogis, select=2, ylim=c(0.02, 0.08), shift = coef(m_macro_normal_bam_covid)[1], shade = TRUE, shade.col = "lightblue") #mat_age plot(m_macro_normal_bam_covid, trans = plogis, select=3, ylim=c(0.04, 0.06), shift = coef(m_macro_normal_bam_covid)[1], shade = TRUE, shade.col = "lightblue", xlab = "120 : 2016-12, 130: 2017-10, 140: 2018-08, 150: 2019-06, 160: 2020-04") #birth_Y_M_num plot(m_macro_normal_bam_covid, trans = plogis, select=4, ylim=c(0.02, 0.06), shift = coef(m_macro_normal_bam_covid)[1], shade = TRUE, shade.col = "lightblue") #ssep k.check(m_macro_normal_bam_covid)## k' edf k-index p-value
## s(GA_weeks) 9 6.571220 0.9276286 0.0125
## s(mat_age) 9 3.928893 0.9485468 0.3250
## s(birth_Y_M_num) 9 5.391057 0.9557668 0.5200
## s(mean_ssep2) 9 4.499786 0.9518133 0.3800
concurvity(m_macro_normal_bam_covid)## para s(GA_weeks) s(mat_age) s(birth_Y_M_num) s(mean_ssep2)
## worst 0.7599644 0.02070321 0.10901624 0.0021237106 0.17658129
## observed 0.7599644 0.01520515 0.07600011 0.0001499351 0.01367128
## estimate 0.7599644 0.01329586 0.07180917 0.0016809370 0.07916320
#to calculate the OR of Macrosomia vs normal BW
OR <- exp(coef(m_macro_normal_bam_covid))
beta <- coef(m_macro_normal_bam_covid)
Vb <- vcov(m_macro_normal_bam_covid, unconditional = TRUE)
se <- sqrt(diag(Vb))
lci <- exp(beta-1.96*se)
uci <- exp(beta+1.96*se)
d <- abs(beta)/1.81
my.ci <- data.frame(cbind(OR, lci, uci, d))
round(head(my.ci, 13), 2)## OR lci uci d
## (Intercept) 0.05 0.05 0.05 1.64
## parity_cat(1,2] 1.85 1.80 1.90 0.34
## parity_cat(2,3] 2.42 2.34 2.51 0.49
## parity_cat(3,20] 2.78 2.62 2.94 0.56
## Urban3 1.00 0.98 1.03 0.00
## LanguageFrench 0.75 0.73 0.78 0.16
## LanguageItalian 0.66 0.61 0.71 0.23
## mother_nationality_cat2Africa 0.94 0.88 1.00 0.04
## mother_nationality_cat2Asia 0.90 0.84 0.96 0.06
## mother_nationality_cat2Europe 1.13 1.11 1.16 0.07
## mother_nationality_cat2Northern America 1.44 1.25 1.66 0.20
## mother_nationality_cat2Southern and Central America 1.11 1.01 1.23 0.06
## s(GA_weeks).1 2.50 0.01 1031.72 0.51